Complementary metal oxide semiconductor (CMOS) image sensors have become
more attractive recently due to their capabilities of offering system-on-chip (SoC),
low cost and low power consumption. Conventional CMOS image sensors, however,
have a limited dynamic range of 65-75 dB, while a typical outdoor scene has a
dynamic range of more than 100 dB. The purpose of this dissertation is to design a novel
CMOS image sensor to deal with such high dynamic range scenes. The dissertation
begins with a review of basic physics of photodetectors. The following analysis of
conventional CMOS imager principles reveals that a limited output signal swing and
a fixed integration time result in the dynamic range limitation. To circumvent this
problem, we proposed a novel time-to-first-spike (TTFS) CMOS imager, which
encodes illuminance with integration time instead of reading out the analog value at a
fixed integration time. The principle and circuit design of this novel TTFS imager are
discussed in detail. Our analysis has shown that this TTFS imager is able to achieve
high dynamic range as well as good signal-to-noise ratio performance. A prototype
chip was fabricated in TSMC 0.18 µm digital technology and fully tested. The testing
results verify the expected performance. Some optimization techniques, e.g., optimal
variation of reference voltage, and several modified TTFS imager designs are also
presented in this dissertation.

CHAPTER 1
INTRODUCTION 

Invented in 1970, charge-coupled devices (CCDs) are presently the technology
of choice for most imaging applications due to their high sensitivity, high quantum
efficiency and large fill factor [1]. The demand for near-perfect charge transfer
efficiency, however, makes CCDs difficult to scale up to very large array size and achieve
high readout speed. Fabricated in a special process, CCDs are incapable of
integrating other electronic circuits on the same chip. In addition, they need high voltage
operations, thus consuming more power [2].

The  fast  growing  digital  multimedia  application  market  requires  miniaturized, 
low  cost  and  low  power  consumption  cameras.  This  demand  has  attracted  more 
attention to the design of CMOS-based image sensors. CMOS image sensors are
fabricated in standard CMOS technologies and are potentially able to integrate a significant
amount  of  VLSI  electronics,  e.g.,  timing  control,  analog-to-digital  converters  (ADC) 
and  signal  processing,  into  a single  chip.  Therefore,  they  greatly  reduce  component 
and  package  cost.  Powered  by  standard  logic  supply  voltages,  CMOS  image  sensors 
also  benefit  from  very  low  power  consumption  measured  in  the  tens  of  milliwatts  for 
a 256 × 256 array [3].

One  serious  problem  with  CMOS  image  sensors  is  the  limited  dynamic  range 
(DR).  Dynamic  range  is  typically  defined  as  the  ratio  of  the  largest  nonsaturated 
signal  to  the  smallest  measurable  signal.  Conventional  CMOS  image  sensors  are 
usually  limited  to  65-75dB  DR  due  to  the  narrow  signal  swing  and  a single  integration 
time for all pixels, while a typical outdoor scene has a dynamic range of more than
100 dB. With standard CMOS technology scaling down to submicron levels, DR tends
to deteriorate further because of the reduced signal headroom and increased noise
floor. Inspired by biological vision theory, we have proposed a time-to-first-spike
(TTFS) imager [4]. Instead of reading out analog signals, we encode the illuminance
intensities with temporal information. There is no fixed integration time as with
conventional CMOS imagers; instead, the integration time of each pixel varies with
respect to the illuminance. Thus it overcomes the restriction of limited voltage swing
and  extends  the  dynamic  range  in  the  time  domain.  The  early  version  of  the  TTFS 
imager  had  a large  pixel  size  and  a potentially  high  collision  rate  for  a large  size 
array.  In  this  dissertation,  we  will  optimize  our  original  architecture  at  both  system 
and  circuit  levels.  The  dissertation  is  organized  as  follows:  Chapter  2 provides  the 
fundamentals  of  solid-state  image  sensors.  In  Chapter  3,  we  will  review  the  existing 
high  dynamic  range  image  sensors  and  describe  the  principles  of  our  novel  CMOS 
image  sensor.  An  optimal  reference  voltage  variation  strategy  for  TTFS  imagers  is 
presented  in  Chapter  4.  Chapter  5 describes  the  circuit  design  and  presents  testing 
results.  Several  modified  TTFS  imagers  with  low  readout  collision  rate  are  included 
in Chapter 6. Finally, the dissertation is concluded with the future work in Chapter 7.


CHAPTER  2 

FUNDAMENTALS  OF  SOLID-STATE  IMAGE  SENSORS 


A solid-state image sensor, an integrated circuit that contains a number of
photodetectors in a 2D array, converts an optical signal into an electrical output. A basic
physics  review  of  photodetectors  is  provided  in  this  chapter.  Following  that,  we  will 
briefly  introduce  the  two  most  important  solid-state  imagers:  CCD  and  CMOS  image 
sensors.  A noise  analysis  is  given  at  the  end  of  this  chapter. 

2.1  Physics  Review  of  Solid-State  Photodetectors 
Photodetectors  are  solid-state  devices  that  detect  optical  signals  and  convert 
them  into  electrical  signals  [5].  This  photodetection  process  can  be  summarized  with 
the  following  steps: 

1.  Absorption  of  photons  to  generate  charge  carriers, 

2.  Drift  of  charge  carriers  in  a built-in  electric  field, 

3.  Collection  of  charge  carriers. 

We  will  explain  these  steps  in  the  following  sections. 

2.1.1  Generation  of  Charge  Carriers 

If  the  incident  photons  have  energies  greater  than  the  bandgap  of  semiconductor 
photodetectors,  electron-hole  (e-h)  pairs  are  generated  inside  the  semiconductors. 
Then  the  electrons  in  the  valence  band  are  brought  to  the  conduction  band  while 
leaving holes behind. According to Planck's relationship, the photon energy is

$$E_{ph} = h\nu = \frac{hc}{\lambda} \qquad (2.1)$$

where $h = 6.626 \times 10^{-34}$ J·s is the Planck constant, $c = 3 \times 10^{8}$ m/s is the speed of light,
and $\lambda$ is the wavelength. To generate the e-h pairs, the wavelength of incident light
should be shorter than $\lambda_c\,[\mu\text{m}] = hc/E_g = 1.24/E_g\,[\text{eV}]$. For silicon, with $E_g = 1.12$ eV,
the threshold wavelength is approximately 1110 nm, whereas for Ge, with $E_g = 0.66$ eV, the
corresponding wavelength is approximately 1880 nm. Since both these threshold wavelengths
are longer than the visible light wavelength that ranges from about 400 nm (violet)
to 700 nm (red) [6], both silicon and germanium photodetectors can be used to sense
visible light.

An absorbing material, e.g., silicon, generally absorbs some of the energy of
the incident light and reflects the rest. For simplicity, we assume here that all
the incident light is available for e-h pair generation and no reflection occurs, i.e.,
$R = \text{reflected intensity} / \text{incident intensity} = 0$. Then, how much energy is absorbed is decided by the
optical absorption coefficient $\alpha$, a function of wavelength. Related to the absorption
coefficient, the light intensity $F(x)$ at the depth of $x$ is given by Beer's law

$$F(x) = F_0 e^{-\alpha x} \qquad (2.2)$$

where $F_0$ is the light intensity at the surface. Beer's law states that the light intensity
exponentially decreases inside the absorbing material. So alternatively, the absorption
coefficient is determined by the penetration depth of the light, which is defined as the
location where the light intensity becomes $1/e$ of the surface light intensity (that is,
where about 63% of the light has been absorbed), i.e., $\delta_{pen} = 1/\alpha$. Typical absorption
coefficients of silicon shown in figure 2-1 indicate that
the penetration depths of violet light and red light are approximately $\delta_{pen} = 0.17\,\mu$m
and $\delta_{pen} = 40\,\mu$m respectively. This information can be used as the basis for color
sensors, e.g., the Foveon X3 digital image sensor [7].
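
As a quick numerical check of the cutoff wavelength relation and Beer's law, the following Python sketch evaluates both. The function names are illustrative, and the absorption coefficient is an assumed value, chosen only so that the penetration depth equals the 0.17 µm quoted above for violet light; it is not taken from measured data.

```python
import math

HC_EV_UM = 1.24  # h*c expressed in eV*um (rounded), as used in the text


def cutoff_wavelength_um(bandgap_ev):
    """Longest wavelength (um) whose photons can still create e-h pairs."""
    return HC_EV_UM / bandgap_ev


def intensity_at_depth(f0, alpha_per_um, depth_um):
    """Beer's law: F(x) = F0 * exp(-alpha * x)."""
    return f0 * math.exp(-alpha_per_um * depth_um)


def penetration_depth_um(alpha_per_um):
    """Depth at which the intensity has fallen to 1/e of its surface value."""
    return 1.0 / alpha_per_um


lambda_si = cutoff_wavelength_um(1.12)  # silicon
lambda_ge = cutoff_wavelength_um(0.66)  # germanium

# Assumed absorption coefficient giving a 0.17 um penetration depth.
alpha_violet = 1.0 / 0.17
remaining = intensity_at_depth(1.0, alpha_violet,
                               penetration_depth_um(alpha_violet))

print(f"Si cutoff: {lambda_si:.3f} um, Ge cutoff: {lambda_ge:.3f} um")
print(f"fraction of light left at the penetration depth: {remaining:.3f}")
```

Both cutoff wavelengths come out well above 0.7 µm, which is the basis of the statement that silicon and germanium can sense visible light.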

2.1.2  Photon  Collection  and  Quantum  Efficiency 

The  reverse  process  of  e-h  generation  in  a semiconductor  is  recombination,  which 
can  annihilate  e-h  pairs  generated  by  the  incident  light.  Fortunately,  we  can  use  an 
electrical  field  to  effectively  separate  electrons  and  holes  and  cause  the  carriers  to 
reach  some  collection  contacts  [8].  Thus  photocurrent,  Jph,  is  formed,  which  has 
three  components: 
Figure 2-1: Absorption coefficients of silicon (after Gamal [6])

1. Current due to the carriers generated in the depletion region and swept away
by a strong electric field, $J_{drift}$,

2. Current due to holes (minority carrier) generated in the n-type quasi-neutral
region, $J_{diff,p}$,

3. Current due to electrons (minority carrier) generated in the p-type quasi-neutral
region, $J_{diff,n}$.

Figure 2-2 illustrates the photocurrent generation mechanism in a pn junction
photodiode. Assuming the n-layer is thin enough to cause negligible absorption, i.e., $x_1 = 0$
[9], and the p-type region is the lower concentration part of the junction, then the
total current density through the depletion layer is

$$J_{ph} = J_{drift} + J_{diff,n} \qquad (2.3)$$

Assuming a monochromatic incident light with intensity $F_0$ at the surface, the
carrier generation rate at the depth of $x$ is given by

$$G(x) = -\frac{d}{dx}\bigl(F(x)\bigr) = \alpha F_0 e^{-\alpha x} \qquad (2.4)$$

Figure 2-2: Diagram of photocurrent generation

If an abrupt pn junction is assumed and all the generated carriers in the depletion
layer are collected, the drift current density is given by

$$J_{drift} = \int_0^{x_2} qG(x)\,dx = qF_0\left(1 - e^{-\alpha W}\right) \qquad (2.5)$$

where $q = 1.6 \times 10^{-19}$ C is the electron charge, and $W = x_2$ is the depletion width as
labelled in figure 2-2.

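Unlike the drift current, the diffusion current in the quasi-neutral p region is governed
by the one-dimensional diffusion equation

$$D_n \frac{\partial^2 n_p}{\partial x^2} - \frac{n_p - n_{p0}}{\tau_n} + G(x) = 0 \qquad (2.6)$$

where $D_n$ is the diffusion coefficient for electrons, $\tau_n$ is the lifetime of excess carriers,
and $n_{p0}$ is the equilibrium electron density. With the boundary conditions $n_p = 0$
at $x = W$ and $n_p = n_{p0}$ at $x = \infty$, we can solve the equation to get

$$n_p = n_{p0} - \left(n_{p0} + C_1 e^{-\alpha W}\right) e^{(W - x)/L_n} + C_1 e^{-\alpha x} \qquad (2.7)$$

where $L_n = \sqrt{D_n \tau_n}$ and $C_1 = \dfrac{F_0}{D_n}\,\dfrac{\alpha L_n^2}{1 - \alpha^2 L_n^2}$.
Now the diffusion current density is given by

$$J_{diff,n} = qD_n \left.\frac{\partial n_p}{\partial x}\right|_{x=W} = qF_0\,\frac{\alpha L_n}{1 + \alpha L_n}\,e^{-\alpha W} + q n_{p0}\,\frac{D_n}{L_n} \qquad (2.8)$$

Combining equation 2.5 and 2.8, we obtain the total current density

$$J_{ph} = qF_0\left(1 - \frac{e^{-\alpha W}}{1 + \alpha L_n}\right) + q n_{p0}\,\frac{D_n}{L_n} \qquad (2.9)$$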


As a rule of thumb, it is best to widen the depletion region to absorb more
photons [10]. For a one-sided silicon pn diode, the depletion region width can be
described by

$$W = \sqrt{\frac{2\varepsilon_{si}\left(\psi_{bi} + V_b\right)}{qN}} \qquad (2.10)$$

where $\varepsilon_{si}$ is the permittivity of silicon, $\psi_{bi}$ is the built-in potential of a one-sided pn
diode, $V_b$ is the externally applied reverse bias to the junction, and $N$ is the lightly
doped density, either the donor or the acceptor concentration. According to the
above equation, the depletion region can be widened by lowering the doping density
or increasing the reverse bias to the junction.

In applications, the photodetector response is characterized by quantum
efficiency (QE) $\eta(\lambda)$, defined as the ratio of the collected charge pairs to the absorbed
photons. Ignoring the second term of equation 2.9, we have

$$\eta(\lambda) = \frac{J_{ph}/q}{F_0} = 1 - \frac{e^{-\alpha W}}{1 + \alpha L_n} \qquad (2.11)$$

Similar to the incident light generated photocurrent, QE is also a function of
wavelength and directly related to the process parameters via $L_n$ and $W$.
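
To make the dependence of quantum efficiency on the depletion width concrete, the sketch below evaluates the one-sided junction width and the QE expression obtained by neglecting the dark term of the total current density, i.e. $\eta = 1 - e^{-\alpha W}/(1 + \alpha L_n)$. All numerical values (doping density, built-in potential, absorption coefficient, diffusion length) are assumptions chosen for illustration only, not parameters of any particular process.

```python
import math

Q_E = 1.602e-19              # electron charge in C
EPS_SI = 11.7 * 8.854e-14    # permittivity of silicon in F/cm


def depletion_width_cm(doping_cm3, v_bias, psi_bi=0.7):
    """One-sided abrupt junction: W = sqrt(2*eps*(psi_bi + V_b) / (q*N))."""
    return math.sqrt(2.0 * EPS_SI * (psi_bi + v_bias) / (Q_E * doping_cm3))


def quantum_efficiency(alpha_per_cm, w_cm, l_n_cm):
    """QE = 1 - exp(-alpha*W) / (1 + alpha*L_n), dark term neglected."""
    return 1.0 - math.exp(-alpha_per_cm * w_cm) / (1.0 + alpha_per_cm * l_n_cm)


# Assumed example: 1e15 cm^-3 substrate doping, 3.3 V reverse bias.
w = depletion_width_cm(1e15, 3.3)

# Assumed alpha = 1e4 cm^-1 and a short diffusion length L_n = 1 um.
qe_narrow = quantum_efficiency(1e4, 0.5 * w, 1e-4)
qe_wide = quantum_efficiency(1e4, w, 1e-4)

print(f"W = {w * 1e4:.2f} um")
print(f"QE at W/2: {qe_narrow:.3f}, QE at W: {qe_wide:.3f}")
```

Under these assumptions, doubling the depletion width raises the QE, consistent with the rule of thumb stated above.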


In a typical CMOS process, three types of vertical P/N diodes, i.e., n+/psub,
p+/nwell and nwell/psub, are available. The reported QEs for n+/psub and p+/nwell
diodes are lower than that for nwell/psub, which is mainly due to two reasons. First,
silicide technology, widely used over the n+/p+ regions in today's advanced
technologies [11], is opaque to visible light and degrades the light sensitivity. In most
technologies, it is not feasible to access the device process with a silicide block.
Second, an nwell/psub diode has a wider depletion region compared to the other two
diodes under the same bias condition, thus collecting more carriers.

2.1.3  Dark  Current 

Dark current is the thermally generated photodiode leakage current, which does
not depend on the incident light intensity. It comes from the defects either at the
SiO2-Si surface or in the bulk. Among them, the high irregularity at the SiO2-Si
surface is the principal contributor of dark current, which has been verified by the
measured  data  in  Loukanova  et  al.  [12].  Dark  current  is  an  important  parameter  to 
characterize  the  performance  of  image  sensors  since  the  variations  of  dark  current 
from  pixel  to  pixel  will  contribute  to  the  fixed  pattern  noise  (FPN).  In  addition,  the 
shot  noise  associated  with  dark  current  is  part  of  the  total  temporal  noise,  which  could 
deteriorate  the  signal-to-noise  ratio  (SNR)  and  dynamic  range  (DR)  performance  of 
image  sensors.  So  it  is  preferred  to  design  image  sensors  with  a low  dark  current. 

Under low-bias conditions, the dark (leakage) current of a pn junction consists
of the diffusion current from the quasi-neutral areas and the thermal generation
current from the depletion region. Adopting the same device structure described in the
previous section, the dark current density can be expressed as [9]

$$J_{dark} = J_{diff}^{dark} + J_{SRH}^{dark} = q\sqrt{\frac{D_n}{\tau_n}}\,\frac{n_i^2}{N_a} + q\,\frac{n_i W}{\tau_{gen}} \qquad (2.12)$$

where $D_n$ is the electron diffusivity, $\tau_n$ is the electron lifetime, $N_a$ is the doping level
of the p-type region, $n_i$ is the intrinsic concentration of silicon, $\tau_{gen}$ is the generation
lifetime, and $W$ is the depletion width. The foregoing derivation reveals that the
dark current is a strong function of the depletion region width and the temperature.
The temperature dependence mainly comes from the intrinsic concentration of silicon,
$n_i$ [12]. Generally, a high temperature results in a large intrinsic concentration, thus
a high dark current.

Physically,  there  are  two  feasible  ways  to  reduce  the  dark  current,  i.e.,  narrowing 
the  depletion  region  or  decreasing  the  working  temperature.  Since  a narrow  depletion 
region  may  degrade  the  quantum  efficiency,  a low  working  temperature  becomes  the 
best  choice.  In  addition,  many  other  efforts  have  been  tried  to  lower  dark  current 
through  either  an  optimized  layout  approach  or  a clever  readout  circuit  design.  Since 
the SiO2-Si surface defect is the dominant dark current source, a small dark current
can be achieved if the photon-sensing area is isolated from the defective field oxide
edge. One way is to modify the process with a shallow active layer above the
photon-sensing area to form a new structure, called pinned photodiode [13]. Another way is
to  employ  a reset  gate-poly  ring  surrounding  the  photodiode,  which  can  eliminate  the 
dark  current  originating  from  the  border  region  adjacent  to  the  defective  field  oxide 
edge  without  modifying  the  process  [14].  Aside  from  optimal  pixel  designs,  a novel 
readout  scheme  with  a combined  photogate/photodiode  photo-sensing  device  is  also 
able  to  achieve  an  ultra-low  dark  current.  Thanks  to  the  correlation  of  dark  currents 
in  neighboring  readout  signals,  the  dark  currents  can  be  somewhat  cancelled  in  the 
output  signal  by  differentiating  two  neighboring  readout  signals  [15]. 

2.2  Charge-Coupled  Devices  (CCDs) 

The CCD was invented in 1970. This dynamic (charge) shift register is
implemented using closely spaced MOS capacitors with 2, 3 or 4 phase clocks (see figure
2-3). CCDs are optimized photodetectors, whose virtues include high QE, low dark
current, low noise and high fill factor.

Figure 2-3: A three phase CCD (after Gamal [6])

Figure 2-4: An interline transfer CCD image sensor (after Gamal [6])

A CCD imager generally adopts a serial readout technique as illustrated in figure
2-4. The analog charge must be shifted out of the chip before it is converted into a
digital value. It requires, on average, several thousands of shifts for a large CCD
array. Therefore the charge transfer efficiency (CTE), defined as the fraction of
signal charge transferred from one CCD stage to the next, should be high enough to
avoid charge loss. To understand this serious issue, let us consider a CTE given by
$\eta = 0.999$. Then the net fraction of signal transferred after $m = 1024$ stages is only
$\eta^m = 0.359$, which means a great amount of charge has been lost during the transfer
process. The need for near-perfect CTE has a great impact on CCD imagers. In
summary, the major reported limitations of CCDs are [6, 16]:

1. Limited frame rate due to the minimum required transfer time per CCD stage
to ensure a near-perfect CTE,

2. Difficulty in achieving a large array size, limited by the CTE,

3. Requiring complex high speed shifting clocks for proper operations,

4. High power consumption since a relatively high voltage (up to 15 V) is required
for proper operations and the entire array is switching all the time,

5.  Highly  nonprogrammable, 

6.  Inability  to  implement  system-on-chip  (SoC). 
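
The charge-loss example given before this list (a CTE of 0.999 over 1024 stages) can be reproduced with a few lines of Python. The helper names are illustrative, and the 99% target in the second calculation is an assumed figure used only to show how demanding the per-stage requirement becomes.

```python
def net_transfer_fraction(cte, stages):
    """Fraction of the signal charge left after `stages` transfers."""
    return cte ** stages


def required_cte(target_fraction, stages):
    """Per-stage CTE needed to keep `target_fraction` after `stages` transfers."""
    return target_fraction ** (1.0 / stages)


remaining = net_transfer_fraction(0.999, 1024)

# Assumed target: retain 99% of the charge over the same 1024 stages.
cte_needed = required_cte(0.99, 1024)

print(f"charge remaining after 1024 stages: {remaining:.3f}")
print(f"per-stage CTE needed for 99% retention: {cte_needed:.6f}")
```

Even a per-stage loss of 0.1% compounds into a loss of roughly two thirds of the signal, which is why the array size and the frame rate of CCDs are both tied to the CTE.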

Though  CCDs  still  dominate  in  the  commercial  products  market,  the  limitations 
have  driven  researchers  to  study  new  solutions  to  circumvent  the  major  weaknesses 
of CCD technology. CMOS image sensors emerged in response. We will address
this  technology  in  the  next  section. 

2.3  CMOS  Image  Sensors 

A CMOS  image  sensor,  fabricated  in  a standard  CMOS  process,  is  capable  of 
integrating  timing  and  control  electronics,  a sensor  array,  signal  processing  electronics, 
an  analog-to-digital  converter  (ADC)  and  a full  digital  interface  on  the  same  chip  [1]. 
It  operates  with  standard  logic  supply  voltages  and  consumes  little  power.  Recent 
advances  have  made  CMOS  image  sensors’  performance  competitive  with  CCDs. 
Most  CMOS  image  sensors  are  two-dimensional  addressable  arrays  as  shown  in  figure 
2-5.  After  integrating  photocurrent  for  a predefined  period,  the  analog  charge  is  read 
out by transferring one row at a time to the column sample-and-hold circuits, then
reading out one or more pixels in the selected row at a time using the multiplexer.

2.3.1 A Simple Equivalent Circuit of Photodiode

As  discussed,  most  CMOS  image  sensors  operate  in  the  integration  mode,  and  the 
photocurrent  is  read  out  in  terms  of  integrated  charge.  Since  photodiodes  are  widely 
used  in  CMOS  image  sensors,  a simple  equivalent  circuit  of  a photodiode  is  needed  for 
the  purpose  of  analysis  and  simulation.  The  equivalent  circuit  model  of  a photodiode 
is  shown  in  figure  2-6.  It  is  made  up  of  a current  source  and  a capacitor  in  parallel. 
Once it is reset, the capacitor is discharged from the initial reset voltage $V_{reset}$ by the
normalized photocurrent $I(V_{out})$ given in equation 2.9. Hence, the following relation
holds

$$\int_{V_{out}(t)}^{V_{reset}} \frac{C_{pd}(V_{out})}{I(V_{out})}\,dV_{out} = \int_0^t dt \qquad (2.13)$$

where $C_{pd}(V_{out}) = C_0 + C_d(V_{out})$. $C_0$ is the peripheral capacitance normalized by the
photodiode junction area $A$, and $C_d(V_{out})$ is the unit depletion capacitance of the
photodiode and a function of $V_{out}$, $C_d(V_{out}) = \varepsilon_{si}/W_d(V_{out})$. It is very difficult to derive
a closed form solution for the above equation. Here we only provide the numerical
simulation results for an nwell/psub diode with $V_{reset} = 3.3$ V in figure 2-7 and figure
2-8.

Figure 2-5: A typical CMOS image sensor architecture


Two scenarios are considered here. First, the peripheral capacitance $C_0$ is
negligible. Second, $C_0$ is much larger than $C_d$. Surprisingly, both cases demonstrate
good linearity for integrated charge with respect to incident light intensity and for
the drop in output voltage with respect to integration time. So the photodiode model can
be simplified to a constant capacitor $C_{pd}$ and a constant photocurrent source $I$ in
parallel.

Figure 2-6: An equivalent model of photodiode
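
Under the simplified model (a constant capacitor in parallel with a constant photocurrent source), the photodiode voltage falls linearly with time after reset. The following sketch illustrates this behaviour; the function names and the component values (10 fF, 100 fA, 3.3 V reset) are assumptions for illustration, not simulation results from this chapter.

```python
def photodiode_voltage(v_reset, i_photo, c_pd, t):
    """Linear discharge V(t) = V_reset - I*t/C_pd, clamped at ground."""
    return max(v_reset - i_photo * t / c_pd, 0.0)


def time_to_reach(v_reset, v_target, i_photo, c_pd):
    """Integration time needed for the voltage to fall to v_target."""
    return c_pd * (v_reset - v_target) / i_photo


V_RESET = 3.3    # reset voltage in V
C_PD = 10e-15    # assumed photodiode capacitance in F
I_PH = 100e-15   # assumed photocurrent in A

v_after_100ms = photodiode_voltage(V_RESET, I_PH, C_PD, 0.1)
t_one_volt = time_to_reach(V_RESET, V_RESET - 1.0, I_PH, C_PD)
t_one_volt_2x = time_to_reach(V_RESET, V_RESET - 1.0, 2 * I_PH, C_PD)

print(f"voltage after 100 ms: {v_after_100ms:.2f} V")
print(f"time for a 1 V drop: {t_one_volt * 1e3:.0f} ms "
      f"({t_one_volt_2x * 1e3:.0f} ms at twice the photocurrent)")
```

In this linear model the time needed for a given voltage drop is inversely proportional to the photocurrent.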


2.3.2  Pixel  Circuits 

Conventional  CMOS  image  sensors  can  be  divided  into  two  categories:  passive 
pixel  sensors  (PPS)  and  active  pixel  sensors  (APS).  The  PPS  concept  is  shown  in 
figure  2-9,  which  consists  of  a photodiode  and  an  access  switch.  When  the  photodiode 
is  accessed,  the  integrated  charge  is  converted  to  a voltage  by  a charge  integrating 
amplifier  located  at  the  bottom  of  the  column  bus.  The  large  capacitive  load  causes 
very  slow  readout  speed.  Fortunately,  with  the  insertion  of  a buffer  into  the  pixel, 
an  APS  sensor  can  efficiently  solve  the  readout  speed  problem  with  PPS.  Figure 
2-10 shows a typical active pixel schematic. In steady state, assuming that charge $Q$ has
accumulated on the photodiode capacitance $C_{pd}$ at the end of integration and ignoring
the voltage drop across the access transistor and body effect, we obtain the output
voltage

$$V_s = V_{dd} - \frac{Q}{C_{pd}} - V_{th} - \sqrt{\frac{2I_{bias}}{\mu_n C_{ox}\,(W_F/L_F)}} \qquad (2.14)$$

where $V_{th}$ is the source follower threshold voltage, $C_{ox}$ is the unit oxide capacitance,
$\mu_n$ is the electron mobility, $W_F/L_F$ is the source follower transistor size, and $I_{bias}$ is
the source follower bias current. Obviously,
the readout voltage $V_s$ directly reflects the integrated photosignal.


Figure 2-7: Simulated photodiode response when $C_0$ is negligible. (a) integrated
charge vs. incident light intensity for wavelength = 500 nm. (b) output voltage vs.
integration time for $F_0$ = 3.2 x photons/cm²·s at the surface.

2.4  Noise  in  Image  Sensors 

Like  other  electronic  products,  the  overall  performance  of  an  image  sensor  is 
ultimately  determined  by  the  noise  introduced  by  the  system  into  the  signal.  Noise 
in  image  sensors  is  typically  divided  into  temporal  noise  and  fixed  pattern  noise.  We 
will  address  these  noise  sources  separately  in  the  following. 

2.4.1  Temporal  Noise 

Temporal noise is the time-dependent fluctuation in the signal level due to
device  noise.  It  can  be  introduced  from  the  pixel,  the  readout  circuit,  the  substrate 
and  the  power  supply.  However,  we  will  not  include  the  circuit-oriented  temporal 
noise  originating  from  the  substrate  coupling  or  the  power  supply  oscillation  in  the 
following  discussion,  because  they  are  considered  to  be  negligible  compared  to  other 
noise  sources. 


Figure 2-8: Simulated photodiode response when $C_0$ is much larger than $C_d$. (a)
integrated charge vs. incident light intensity for wavelength = 500 nm. (b) output voltage
vs. integration time for $F_0$ = 1.96 x photons/cm²·s at the surface.

Figure 2-9: Passive pixel schematic

Figure 2-10: Active pixel schematic

1. Pixel Photon Shot Noise

Photon shot noise$^{1}$ is essentially the result of the random generation of carriers
and obeys the Poisson statistics. It is generated either by the thermal generation
within a depletion region or by the random arrival of photons. The noise is
expressed as

$$\sigma_{photon}^2 = \frac{Q_{photon}}{q} = \frac{i_{ph} t_{int}}{q} \quad (\text{electrons})^{2\;\,2} \qquad (2.15)$$

where $Q_{photon}$ is the integrated charge introduced by photons, $i_{ph}$ is the
generated photocurrent, and $t_{int}$ is the integration time.

2. Pixel Dark Current Shot Noise

Same as the photon shot noise, the thermal generation of charge carriers under
dark conditions is also a Poisson process. If $Q_{dark}$ is the integrated charge due
to a dark current $i_{dark}$ within an integration period of $t_{int}$, the dark current shot
noise is given by

$$\sigma_{dark}^2 = \frac{Q_{dark}}{q} = \frac{i_{dark} t_{int}}{q} \quad (\text{electrons})^2 \qquad (2.16)$$

3. Pixel Reset (kT/C) Noise

Introduced by the channel thermal noise of the reset transistor (i.e., M1 in
figure 2-10), the reset noise on the photodiode is given by$^{3}$ [17]

$$\sigma_{reset}^2 = \frac{kTC_{pd}}{q^2} \quad (\text{electrons})^2 \qquad (2.17)$$

where $k = 1.38 \times 10^{-23}$ J/K is the Boltzmann constant, and $T$ is the temperature
in Kelvin.

$^{1}$ In this dissertation, we also call it photocurrent shot noise.
$^{2}$ In this dissertation, we also use e$^-$ to represent electron.
$^{3}$ By using an NMOS as the reset transistor, the reset noise power is $kTC_{pd}/(2q^2)$ if
the steady state is not reached during reset.

4.  In-Pixel  MOS  Device  Noise 

The in-pixel amplifier noise primarily comes from the thermal and flicker noise of
MOS transistors. They may be suppressed by a DC offset cancellation technique
or limiting the bandwidth of the amplifier.

5. Readout Noise

The MOS transistors in the column-level or chip-level readout circuit will
inevitably introduce thermal noise and flicker noise, which are independent of
photocurrent, dark current and integration time.
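
To give a feel for the magnitudes of the pixel-level noise sources listed above, the sketch below evaluates them in electrons. It assumes the usual Poisson result that the shot noise variance equals the mean number of collected electrons, $i\,t_{int}/q$, and the hard-reset value $kTC_{pd}/q^2$ for the reset noise. The function names and the example values (1 fA dark current, 33 ms integration, 10 fF, 300 K) are illustrative assumptions.

```python
import math

Q_E = 1.602e-19   # electron charge in C
K_B = 1.38e-23    # Boltzmann constant in J/K


def shot_noise_electrons(current_a, t_int_s):
    """RMS shot noise: sqrt of the mean number of integrated electrons."""
    return math.sqrt(current_a * t_int_s / Q_E)


def reset_noise_electrons(c_pd_f, temp_k=300.0):
    """RMS kT/C reset noise, sqrt(k*T*C)/q, assuming steady state is reached."""
    return math.sqrt(K_B * temp_k * c_pd_f) / Q_E


def total_noise_electrons(*sigmas):
    """Independent noise sources add in quadrature."""
    return math.sqrt(sum(s * s for s in sigmas))


sigma_dark = shot_noise_electrons(1e-15, 33e-3)   # assumed 1 fA, 33 ms
sigma_reset = reset_noise_electrons(10e-15)       # assumed 10 fF at 300 K
sigma_total = total_noise_electrons(sigma_dark, sigma_reset)

print(f"dark shot noise:  {sigma_dark:.1f} e-")
print(f"reset noise:      {sigma_reset:.1f} e-")
print(f"combined (RSS):   {sigma_total:.1f} e-")
```

With these assumed values the reset noise dominates the dark current shot noise.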

2.4.2  Fixed  Pattern  Noise  (FPN) 

Fixed pattern noise is a non-temporal spatial noise, which is due to device
mismatches in pixels and column-level circuits. The major components include:

1. Dark current FPN due to the mismatch in photodiode leakage currents.

2. Pixel response FPN due to the nonuniformities of geometry, layer thickness or
doping profile.

3. Readout circuit FPN mainly due to the threshold voltage variations between
MOSFETs.

With  the  application  of  a double-delta-sampling  (DDS)  circuit  [3],  the  readout 
circuit  FPN  can  be  suppressed  to  a negligible  level  compared  to  the  photocurrent 
FPNs,  i.e.,  the  dark  current  FPN  and  the  pixel  response  FPN.  Under  this  condition, 
the  FPN  is  proved  to  be  dominated  by  the  dark  current  FPN  at  low  signal  levels  and 
the  pixel  response  FPN  at  high  signal  levels  [18]. 

2.5  Discussion 

The  previous  derivations  of  photocurrent  and  quantum  efficiency  are  very  simple 
but  instructive.  A more  explicit  analysis  may  include  the  diffusion  current  in  the 
quasi-n  region,  the  reflection  at  the  surface  of  the  chip,  the  reflections  and  absorptions 
in the layers above the photodetectors and the variation of $J_{ph}$ over the photodetector
area  (edge  effects)  [6].  Though  the  physics  of  photodetectors  reviewed  in  this  chapter 
is  quite  fundamental,  a good  understanding  is  beneficial  to  our  system  design. 


CHAPTER  3 

REVIEW  OF  HIGH  DYNAMIC  RANGE  CMOS  IMAGERS 
A typical  CMOS  APS  has  a dynamic  range  of  65-75dB  [19].  To  improve  the 
dynamic  range,  two  approaches  are  considered.  One  is  to  reduce  the  noise  floor  and 
extend  the  dynamic  range  towards  the  weak  signal  region.  The  other  is  to  expand 
the  nonsaturating  signal  level.  In  this  dissertation,  we  only  concentrate  on  the  latter. 
This  chapter  starts  with  the  analysis  of  the  dynamic  range  problem  with  conventional 
CMOS  APSs,  followed  by  a review  of  existing  dynamic  range  enhancement  methods. 
Finally,  our  novel  biologically  inspired  time-to-first-spike  imager  is  presented. 

3.1  DR  Problem  with  Conventional  CMOS  APSs 
In  this  section,  we  take  the  most  commonly  used  photodiode  APS  as  an  example 
to  show  the  DR  limitation.  For  an  APS  imager  shown  in  figure  3-1,  if  M4  operates  in 
the  strong  inversion  region  and  saturates,  the  output  voltage  in  terms  of  the  integrated 
charge  is 

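$$V_o = V_{dd} - \frac{Q}{C_{pd}} - V_{th,F} - \sqrt{\frac{W_B/L_B}{W_F/L_F}}\,\left(V_{bias} - V_{th,B}\right) \qquad (3.1)$$

where $V_{th,F}$ is the source follower threshold voltage without considering body effect,
$V_{th,B}$ is the bias transistor threshold voltage, $V_{bias}$ is the gate bias voltage of M4,
and $W_F/L_F$ and $W_B/L_B$ are the sizes of M2
and M4 respectively. The maximum output voltage occurs when $Q = 0$, which is

$$V_{omax} = V_{dd} - V_{th,F} - \sqrt{\frac{W_B/L_B}{W_F/L_F}}\,\left(V_{bias} - V_{th,B}\right) \qquad (3.2)$$

whereas the minimum output voltage $V_{omin}$ is decided by the operation of the bias
transistor M4. To ensure that M4 works in saturation, $V_{omin}$ must be

$$V_{omin} \ge V_{bias} - V_{th,B} \qquad (3.3)$$

The resulting signal swing is

$$V_s = V_{omax} - V_{omin} \qquad (3.4)$$

Figure 3-1: DR limitation of CMOS APS
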
Usually,  we  use  well  capacity  instead  of  signal  swing  to  evaluate  the  performance 
of  CMOS  image  sensors.  For  a CMOS  APS,  the  effective  well  capacity,  Qeff  = 
Us  X Cpd,  is  much  less  than  the  available  maximum  well  capacity  Qmax  = ^dd  x Cpd, 
since  the  output  voltage  reaches  its  minimum  value  before  the  diode  voltage  drops  to 
ground.  If  the  dark  current  is  negligible  compared  to  the  regular  photocurrent,  the 
largest  nonsaturating  signal  can  be  computed  as  imax  = Qeff /tint  for  an  integration 
period  of  tint-  Generally,  the  smallest  detectable  input  signal  is  defined  as  the  input 
referred  noise  floor  under  dark  conditions,  which  gives 


*7  / ^darktint 


+ af 
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tint  V Q 

Here, idark is the dark current, and σr² (electrons²) includes the readout noise and
the reset noise. As the ratio of imax to imin, the dynamic range equals

$$DR = 20\log\frac{i_{max}}{i_{min}} = 20\log\frac{Q_{eff}}{q\sqrt{i_{dark}\, t_{int}/q + \sigma_r^2}} \qquad (3.6)$$

Equation 3.6 is plotted in figure 3-2 for a sensor with Vs = 1V, Cpd = 10fF, σr =
20e⁻, and idark = 1fA. As the integration time increases, the increased dark current
shot  noise  increases  the  noise  floor  monotonically  and  reduces  the  dynamic  range. 
Note  that  whatever  the  integration  time  is,  the  maximum  achievable  DR  is  less  than 
70dB  as  a result  of  limited  well  capacity. 
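
As a rough numerical check of the trend described above, the following sketch evaluates the ratio of imax to imin for the quoted sensor parameters (Vs = 1V, Cpd = 10fF, σr = 20e⁻, idark = 1fA). The function name and the choice to work in units of electrons are assumptions made for illustration; this is not the script used to produce figure 3-2.

```python
import math

Q_E = 1.602176634e-19  # elementary charge in coulombs


def dynamic_range_db(t_int, v_s=1.0, c_pd=10e-15, sigma_r=20.0, i_dark=1e-15):
    """Dynamic range in dB of a linear APS for an integration time t_int (s)."""
    q_eff = v_s * c_pd / Q_E                      # effective well capacity, electrons
    dark_electrons = i_dark * t_int / Q_E         # dark shot-noise variance, electrons^2
    noise_floor = math.sqrt(dark_electrons + sigma_r ** 2)  # rms electrons
    return 20 * math.log10(q_eff / noise_floor)


if __name__ == "__main__":
    for t_int in (1e-3, 33e-3, 1.0):
        print(f"t_int = {t_int:g} s -> DR = {dynamic_range_db(t_int):.1f} dB")
```

With these numbers the dynamic range stays below 70dB and falls as the integration time grows, which matches the statement in the text.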


Figure  3-2:  DR  vs.  integration  time 


The achievable dynamic range will deteriorate further as CMOS technology
continues to scale down, because of the reduced signal headroom and the
increased noise floor. Figure 3-3 illustrates this trend with tint = 33ms, Cpd = 10fF,
σr = 20e⁻, and idark = 1fA.

A typical outdoor scene has a DR of more than 100dB, which is far beyond
the DR of conventional CMOS APSs. In order to deal with such high dynamic range
scenes, special designs are required. Current dynamic range enhancement methods
are addressed in the next section.
Figure 3-3: DR vs. signal swing


3.2  Existing  Dynamic  Range  Enhancement  Methods 
3.2.1  Analog  Domain  Methods 

All  the  methods  in  this  category  represent  illuminance  in  the  form  of  an  analog 
signal,  e.g.,  current,  voltage  or  integrated  charge.  Another  representation  of  illumi- 
nance will  be  discussed  later. 

Nonlinear  Signal  Compression  (NSC) 

As  explained  earlier,  the  limited  output  voltage  swing  and  the  linear  mapping  be- 
tween output  voltage  and  the  incident  light  generated  photocurrent  limit  the  achiev- 
able DR  to  only  70dB.  Since  the  voltage  supply  cannot  be  significantly  increased, 
many  researchers  have  tried  to  use  nonlinear  response  curves  to  improve  the  dynamic 
range. 

If the photocurrent is fed into a diode-connected MOS transistor in the
subthreshold region, the output voltage will exhibit a logarithmic response to the input
current. This nonlinear response voltage can cover about five orders of magnitude
of the incoming light intensity, or 100dB of dynamic range. This technique has been
successfully implemented by a research group at Caltech [20]. However, low contrast,
loss of details, low SNR and high nonlinear FPN make it very problematic [19].

Compared  with  the  logarithmic  response  imager,  a well  capacity  adjustment 
technique  has  better  control  over  the  FPN.  It  uses  a lateral  overflow  gate  to  compress 
the  sensor’s  current  versus  charge  response  curve.  A correlated  double  sampling 
(CDS)  technique  is  adopted  to  reduce  the  FPN.  The  reported  dynamic  range  is  96dB 
and  the  FPN  is  only  0.24%  of  saturation  level  [21].  However,  this  technique  exhibits 
worse  SNR  performance  [22]. 

Multiple  Sampling  (MS) 

Conventional CMOS APSs have only one predefined integration time tint. A long
tint favors weak signals by integrating much more charge but easily saturates strong
signals, whereas a short tint can prevent signals from saturating but degrades the
SNR of weak signals. No single integration time can satisfy both dynamic range and
SNR requirements. To solve this problem, a multiple sampling technique has been
proposed, which uses shorter exposure times to capture the brighter parts of the scene
and longer exposure times to capture the darker regions [22, 23]. The reported CMOS
image sensors have shown the expected dynamic range and offered good image quality
as well. This type of image sensor is very promising but fundamentally consumes more
power and requires a large data bandwidth to read out the multiple frames.

3.2.2  Time-Based  Methods

Instead  of  choosing  a single  integration  time  as  in  conventional  CMOS  APSs, 
time-based  image  sensors  allow  each  pixel  to  choose  its  own  optimal  integration  time 
with  respect  to  the  illuminance.  In  this  way,  illuminance  is  encoded  with  temporal 
information much like neural coding in a biological vision system. Time-based
image sensors operate in one of two modes: asynchronous mode or synchronous mode.

Asynchronous Mode (TBAM)

The time-based CMOS image sensors working in the asynchronous mode have no
global  reset  signal  for  all  pixels.  Instead,  each  pixel  works  as  a free-running  continuous 
oscillator  and  the  whole  system  is  essentially  a frequency  modulator  in  that  high 
frequency  pulses  represent  bright  illuminance  and  low  frequency  pulses  represent  dark 
illuminance.  Many  different  readout  architectures  have  been  implemented  to  sample 
the  pulses.  In  the  first  time-based  image  sensor,  an  asynchronous  counter  on  each 
row  counts  the  output  pulses  for  a fixed  time  period  [24].  The  short-term  average 
pulse  frequency  is  used  to  reconstruct  the  illuminance.  A dynamic  range  of  120dB 
was  achieved  for  a static  scene. 

By noticing the equivalence between a synchronous first-order Σ-Δ modulator
and a sampled asynchronously running oscillator, McIlrath [25] uses the binary output
stream from the sampled oscillator to reconstruct the scene. The author claims that
by sweeping through a set of binary weighted frequencies, fs = f0, f0/2, ..., f0/2^k,
only 8k samples need to be taken to give a dynamic range of 6k + 25 dB.

The address-event circuit, originally developed to communicate spike trains
between arrays of silicon neurons on multiple chips [26], can be applied in an imager.
It works as follows: when a pixel reaches a threshold, a request (spike) for access to
the output bus is sent out to the address-event circuit. Once the request is approved,
the address of the pixel is read out. The average interspike interval is measured to
represent the light intensity information [27].

Although the reconstruction methods vary, all the asynchronous mode time-based
imagers in the literature have to read out a large amount of redundant information,
which implies more power consumption, larger data bandwidth and more frame
memory. One serious problem with these designs is that a long frame time may be
needed to collect all useful information for the scene recovery, which is not
feasible for dynamic scenes or video mode applications.

Synchronous Mode (TBSM)

For  synchronous  mode  time-based  CMOS  image  sensors,  there  is  a global  reset 
signal  in  each  imager.  After  a photodetector  is  reset,  the  voltage  on  the  photodetector 
linearly  decreases.  Once  the  voltage  drops  below  the  threshold  voltage,  a time  stamp 
is  stored  on  a capacitor  inside  each  pixel  [28,  29,  30].  Since  the  eventual  readout 
signal  is  an  analog  signal,  it  still  needs  an  external  ADC  to  obtain  digital  values. 

In [31], a time domain sampling method is proposed. 2^k samples are required to
get k bits of resolution, which means requiring much more memory and more power
consumption.

3.3  Time-to-First-Spike  Imager 

Recently, some biologists have claimed that the most useful information from the
retina is contained in the first spike after onset of the neuronal response. Based on
psychophysical experiments, they have determined that reaction times are so short
that there is not enough time to process more than one spike from each neuron per
processing stage [32, 33]. Inspired by both the biology and engineering constraints,
we propose a novel imager, a time-to-first-spike (TTFS) imager, which represents the
illuminance in a different way. To overcome the shortcomings of the previous
time-based designs, the illuminance is transformed into a pulse event that can only occur
once per pixel per frame in the time domain. With respect to the imager’s global (or
row-level) reset signal that occurs at the start of each frame, a brighter pixel’s pulse
event occurs before a darker pixel’s pulse event. The spike readout circuits of the
TTFS imager operate asynchronously like [27]. The collected temporal information
of output addresses can be used to reconstruct the scene while maintaining a wide
dynamic range and performing smart functions, such as histogram equalization and
scene segmentation. We will explicitly discuss this novel imager in the following
sections.

3.3.1  Principle

A time-based image sensor essentially tries to extend the dynamic range on the
high end, which is limited by the power supply for typical APSs. When the
photocurrent discharges the pixel capacitance, the following relation holds

$$I_{ph} = C_{pd}\,\frac{\Delta V}{\Delta t_{int}} \qquad (3.7)$$

where ΔV is the measured pixel output voltage or signal swing, Cpd is the pixel
capacitance (assumed to be constant), Δtint is the integration time for each pixel,
and Iph is the photocurrent for each pixel (assumed to be constant in each integration
period). As discussed previously, typical APSs choose a fixed integration time, and
the signal swing limits the typical dynamic range to about 70dB. In contrast,
the output of each pixel in time-based imagers is not a voltage but the time Δtint
taken to discharge the pixel capacitance. The system is no longer forced to choose a
single integration time, since each pixel naturally chooses a suitable integration time.
For still image mode, the dynamic range can be significantly enhanced when this
temporal coding is used.

The TTFS imager also uses temporal coding. Instead of reading out the analog voltage
across each photodiode at a predetermined exposure time, there is a comparator inside
each pixel. When the voltage on a photodiode drops below a global reference voltage
Vref, the comparator inverts, and the pixel generates a pulse (i.e., it has “fired”). As
shown in figure 3-4, three pixels are located at the addresses (row1, col1), (row2,
col2) and (row3, col3), with photocurrents I1, I2 and I3. Once a pixel has
fired, it is disabled for the rest of the frame after its address is output. The time
at which a pixel’s address is read out represents the photocurrent (illuminance) of
the pixel. For this scheme, an ADC, which is required for conventional imagers to
output digital values, can be replaced with a simple digital counter that records the
time when each pixel fires. This scheme is in fact similar to a single-slope ADC. In
order to reconstruct an image, the firing time of each pixel will likely be quantized
and stored in an external receiver (PC or DSP processor). This time quantization
occurs after the information is scanned off the imager chip. Since the off-chip digital
clock can run very fast, the amount of quantization noise is small relative to the other
delays due to buffering and collisions.

Figure 3-4: Scheme for the TTFS imager in still image mode
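
The temporal coding described above can be illustrated with a minimal sketch. It assumes the relation Iph = Cpd ΔV / Δtint with a constant threshold, and uses hypothetical values (Cpd = 10fF, ΔV = 1V, a 1MHz counter clock, and three made-up photocurrents); none of these function names or numbers come from the prototype chip.

```python
def firing_time(i_ph, c_pd=10e-15, delta_v=1.0):
    """Time (s) for photocurrent i_ph (A) to discharge c_pd by delta_v."""
    return c_pd * delta_v / i_ph


def counter_code(t_fire, f_clk=1e6):
    """Quantize a firing time with a free-running counter clocked at f_clk (Hz)."""
    return round(t_fire * f_clk)


def reconstruct_current(code, c_pd=10e-15, delta_v=1.0, f_clk=1e6):
    """Invert the coding: estimate the photocurrent from a counter code."""
    return c_pd * delta_v * f_clk / code


# Hypothetical pixels: address -> photocurrent in amperes.
pixels = {(1, 1): 100e-12, (2, 2): 10e-12, (3, 3): 1e-12}

if __name__ == "__main__":
    # Brighter pixels fire first, so sorting by firing time sorts by brightness.
    for addr in sorted(pixels, key=lambda a: firing_time(pixels[a])):
        t = firing_time(pixels[addr])
        code = counter_code(t)
        print(addr, f"t = {t:.2e} s", f"code = {code}",
              f"I = {reconstruct_current(code):.2e} A")
```

The counter plays the role of the ramp in a single-slope ADC: a longer firing time maps to a larger code and therefore to a smaller reconstructed photocurrent.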

Since the signal is sampled in the time domain, the dynamic range can be
expressed as the ratio of the longest and shortest integration (firing) times

$$DR = 20\log\frac{i_{max}}{i_{min}} = 20\log\frac{t_{longest}}{t_{shortest}} \qquad (3.8)$$

Usually, the dark current limits the longest possible pixel integration (firing)
time, which, from our measurement, is about 4.5s for the TTFS imager fabricated in
TSMC 0.18µm digital technology. The shortest exposure time for a single pixel is
designed to be 1µs, which gives a dynamic range of more than 130dB (20 log(4.5s/1µs) =
133dB). For a large size array, the shortest exposure time is expected to be 10µs,
which yields a 113dB dynamic range.

Figure 3-5: Scheme for the TTFS imager in video mode


For video mode applications, the maximum achievable integration time is
limited by the video frame time, which is typically about 33ms. For this reason, current
CMOS processes limit time-based (including TTFS) imagers to around 70dB of
dynamic range, which is about the same as for conventional CMOS imagers [34].
However, the TTFS imager can dramatically increase the DR in video mode by slowly
increasing the reference voltage from the lowest value to the reset voltage during
a frame. All pixels are guaranteed to pass the threshold and fire within the fixed
frame time. Existing time-based imagers where the pixels are run as asynchronous
oscillators are not able to take advantage of this feature.
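
The dynamic range figures quoted above follow directly from the ratio of longest to shortest firing time. The short sketch below reproduces them; pairing the 33ms frame time with the 10µs shortest exposure to obtain the roughly 70dB video-mode figure is an assumption made here for illustration.

```python
import math


def time_domain_dr_db(t_longest, t_shortest):
    """Dynamic range in dB from the longest and shortest firing times (s)."""
    return 20 * math.log10(t_longest / t_shortest)


if __name__ == "__main__":
    print(f"still, single pixel: {time_domain_dr_db(4.5, 1e-6):.1f} dB")
    print(f"still, large array:  {time_domain_dr_db(4.5, 10e-6):.1f} dB")
    print(f"video, fixed Vref:   {time_domain_dr_db(33e-3, 10e-6):.1f} dB")
```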
The extended DR due to a varying voltage reference can be illustrated intuitively
as follows. Taking the log of each side of equation 3.7 gives

$$\log I_{ph} = \log C_{pd} + \log \Delta V - \log \Delta t_{int} \qquad (3.9)$$

The lefthand side is the pixel illuminance and its DR must be captured in the sum of
the DR of the components on the righthand side. Since Cpd is constant, the DR can
only be captured by the ΔV and Δtint terms. For conventional CMOS imagers, the
integration time Δtint is fixed, and all of the DR must be encoded in ΔV, resulting
in the usual 65-75dB limitation. Typical time-based imagers hold ΔV constant and
all of the DR must be encoded in Δtint, resulting in the 65-75dB limitation given
above for video mode. For the varying Vref case, the full DR is the sum of the DR of
both Δtint and ΔV, producing a 130-150dB dynamic range for video mode, which is
illustrated in figure 3-5. The varying reference voltage therefore favors dynamic
range extension for video mode applications. It does, however, degrade the captured
scene’s SNR, which will be discussed later.
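
The additive argument can be written out as a short worked equation. It is a sketch under the assumption that the brightest pixel pairs the largest swing with the shortest firing time and the darkest pixel pairs the smallest swing with the longest firing time; the subscripts max and min are introduced here for illustration.

```latex
\begin{align*}
DR_{I} &= 20\log\frac{I_{ph,max}}{I_{ph,min}}
        = 20\log\frac{C_{pd}\,\Delta V_{max}/\Delta t_{min}}
                     {C_{pd}\,\Delta V_{min}/\Delta t_{max}} \\
       &= \underbrace{20\log\frac{\Delta V_{max}}{\Delta V_{min}}}_{\text{65--75 dB}}
        + \underbrace{20\log\frac{\Delta t_{max}}{\Delta t_{min}}}_{\text{65--75 dB}}
        \approx 130\text{--}150\ \text{dB}
\end{align*}
```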

3.3.2  System  Architecture 

The original TTFS image sensor architecture is shown in figure 3-6. To
differentiate it from the modified architectures, we call it the TTFS_classic imager. Inside
each pixel, there is a photodiode, a comparator and digital control circuitry. To
minimize the FPN, an autozeroing technique is adopted to reset the pixel.

Figure 3-6: TTFS_classic imager block diagram

The TTFS_classic imager operates as follows:

1.  After reset, the photocurrent discharges the photodiode. When the voltage
across the photodiode drops below the comparator’s reference voltage, the pixel
makes a row request by pulling down the ~row_request(m) line.

2.  The row arbiter arbitrarily selects a row from the requesting group by making
row_select(m) high. Subsequently, this row’s address is stored by the row
address encoder.

3.  When the control signal row_select(m) is received, all firing pixels in that row
are allowed to make column requests by pulling down the ~col_request(n) lines.

4.  The column latch records all column requests of firing pixels in the selected row.

5.  Once all column requests are latched, the pixels in the selected row that had
been making column requests are disabled from firing again for the rest of the
frame by disable(m) and col_req(n) via the digital control block. As a result,
~row_request(m) from this row is withdrawn. After that, if there are other
valid ~row_request(m) signals, a new row arbitration is allowed to start.
However, the row interface circuit blocks the generation of a new row_select(m)
until all valid data inside the latch cells have been processed. Column arbitration
begins on the requests in the column latch cells. Note that the throughput
control circuit is used to control the column arbitration speed.

6.  When a column is selected by the column arbiter, its column address, the
latched row address and the time information are all read out. The time stamp
represents the illuminance information.

7.  Once the column arbitration is complete, i.e., all valid data inside the latch have
been processed, the row interface circuit allows a new row_select(m) signal to
become valid. At this point, a readout cycle is finished.
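
The readout cycle above can be sketched as a small behavioral simulation. This is a simplified model with several assumptions that are not taken from the chip: the arbiter is replaced by "lowest requesting row index", each column readout takes a fixed time t_col, and the time stamp is the instant at which the address leaves the chip. The function and parameter names are hypothetical.

```python
def simulate_readout(fire_times, t_col):
    """Behavioral sketch of the row/column arbitrated TTFS readout.

    fire_times: dict mapping (row, col) -> time at which the pixel fires.
    t_col:      time needed to read out one latched column address.
    Returns a list of (row, col, time_stamp) events in readout order.
    """
    pending = dict(fire_times)   # pixels that have not been read out yet
    now = 0.0
    events = []
    while pending:
        # Wait until at least one pixel is requesting (steps 1 and 2).
        now = max(now, min(pending.values()))
        requesting_rows = sorted({r for (r, _), t in pending.items() if t <= now})
        row = requesting_rows[0]  # stand-in for the arbitrary row arbitration
        # Latch every firing pixel in the selected row (steps 3 and 4).
        latched = sorted(c for (r, c), t in pending.items() if r == row and t <= now)
        # Latched pixels are disabled for the rest of the frame (step 5).
        for col in latched:
            del pending[(row, col)]
        # Column arbitration: one address per t_col (steps 6 and 7).
        for col in latched:
            now += t_col
            events.append((row, col, now))
    return events


if __name__ == "__main__":
    fires = {(0, 0): 1.0, (0, 1): 1.0, (1, 0): 1.5, (2, 2): 10.0}
    for event in simulate_readout(fires, t_col=1.0):
        print(event)
```

In this example pixel (1, 0) fires at 1.5 but is stamped at 4.0 because row 0 is still being processed, which is the collision delay discussed later in this chapter.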
3.3.3  SNR  Consideration

SNR is another performance criterion, which is defined as the ratio of the input
signal power and the average input-referred noise power. In [35], a simplified
integration mode pixel model is depicted, where both signal and noise are expressed in the
form of charge. The additive noise has two independent components: (i) shot noise
with zero mean and variance of q(iph + idark)t, where t is the integration time, and iph and
idark are the photocurrent and dark current, respectively; (ii) readout noise with variance σr². Thus
the SNR can be calculated as

$$SNR(i_{ph}) = \frac{(i_{ph}\,t)^2}{q\,(i_{ph} + i_{dark})\,t + \sigma_r^2} \qquad (3.10)$$


A more  complex  noise  model  may  include  the  FPN.  For  simplicity,  we  neglect  this 
noise  source  here  since  the  FPN  can  be  reduced  to  a negligible  value  by  using  either 
a correlated  double  sampling  or  an  autozeroing  reset  technique.  From  the  above 
equation,  we  notice  that  SNR  is  also  a function  of  integration  time.  Simulated  SNR 
with respect to integration time is plotted in figure 3-7. Note that SNR monotonically
increases  with  integration  time,  and  ultimately  reaches  the  upper  bound  due  to  well 
saturation.  If  not  saturated,  larger  signals  enjoy  better  SNR  as  expected. 
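
The behavior described above can be checked numerically with a small sketch of the shot-noise plus readout-noise model. It works in units of electrons (so the charge q drops out), clips the signal at the well capacity, and borrows a well capacity of 110,000 electrons, a readout noise of 20 electrons and a dark current of 1fA as assumed example values; it is not the simulation behind figure 3-7.

```python
import math

Q_E = 1.602176634e-19  # elementary charge in coulombs


def snr_db(i_ph, t_int, i_dark=1e-15, sigma_r=20.0, q_well=110_000):
    """SNR in dB for photocurrent i_ph (A) integrated for t_int (s)."""
    n_sig = min(i_ph * t_int / Q_E, q_well)    # signal electrons, clipped at full well
    n_dark = i_dark * t_int / Q_E              # dark electrons
    noise_var = n_sig + n_dark + sigma_r ** 2  # shot noise + readout noise, electrons^2
    return 10 * math.log10(n_sig ** 2 / noise_var)


if __name__ == "__main__":
    for t_int in (1e-3, 10e-3, 100e-3):
        print(f"100 fA, t_int = {t_int:g} s -> SNR = {snr_db(100e-15, t_int):.1f} dB")
```

For a fixed photocurrent the SNR rises with integration time until the well saturates, and at equal integration time a larger unsaturated signal has the higher SNR.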

Since each pixel in a TTFS imager can reach its full well capacity Qwell in still
mode  applications,  the  SNR  is  maximized  as  shown  in  figure  3-8.  This  conclusion 
is  valid  for  most  time-based  image  sensors,  since  they  all  equivalently  use  a constant 
reference  voltage  scheme. 

In video-mode TTFS image sensors, pixels with weak photocurrents cannot
integrate long enough to reach their well capacities. Now consider the case that the
varying reference voltage is piece-wise linear as shown in figure 3-5. A pixel with
a strong photocurrent such that the firing time is less than T0 can achieve full well
capacity, whereas a pixel with a weaker photocurrent, for instance the pixel (row3,
col3) in figure 3-5, does not. Ignoring dark current, the firing time for a weak
signal, i.e., iph < Qwell/T0, is

$$t = \frac{Q_{well} + kT_0}{i_{ph} + k} \qquad (3.11)$$

where k is the slope of the linearly increasing part of the reference voltage,
k = Qwell/(tframe − T0). The SNR is then given by

$$SNR(i_{ph}) = \begin{cases}
\dfrac{\left(i_{ph}\,\dfrac{Q_{well} + kT_0}{i_{ph} + k}\right)^2}{i_{ph}\,\dfrac{Q_{well} + kT_0}{i_{ph} + k}\,q + \sigma_r^2} & i_d < i_{ph} < \dfrac{Q_{well}}{T_0} \\[3ex]
\dfrac{Q_{well}^2}{q\,Q_{well} + \sigma_r^2} & i_{ph} \ge \dfrac{Q_{well}}{T_0}
\end{cases} \qquad (3.12)$$

Figure 3-7: SNR vs. integration time

Figure 3-8 shows the SNR comparison of TTFS imagers with different reference
voltage schemes. The comparison was made by assuming a well capacity of 110,000
electrons, a readout noise of 20 electrons and a dark current of 1fA. It indicates
that the imagers with constant reference voltage enjoy better averaged SNR, since
weak signals have taken advantage of full well capacity. Even though the varying
reference voltage scheme favors dynamic range extension for video mode applications,
it degrades the captured scene’s average SNR, especially for the dark region. It turns
out that T0 in figure 3-5 is a key factor affecting the SNR performance. A simple
optimal strategy for piece-wise linear reference voltage variation will be presented in
the next chapter.

Figure 3-8: SNR of TTFS imagers
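
To make the SNR penalty for weak signals concrete, the sketch below computes the charge a pixel has collected when it fires under a two-segment reference, using t = (Qwell + kT0)/(iph + k) with k = Qwell/(tframe − T0), and ignores dark current as the text does. Working in electrons per second, the 32ms frame time and the 25ms change point are assumptions chosen for illustration.

```python
import math


def collected_charge(i_e, q_well=110_000, t_frame=32e-3, t0=25e-3):
    """Electrons collected at firing for a photocurrent of i_e electrons/s."""
    if i_e * t0 >= q_well:
        return float(q_well)             # fires on the flat segment: full well
    k = q_well / (t_frame - t0)          # slope of the varying segment, e-/s
    t_fire = (q_well + k * t0) / (i_e + k)
    return i_e * t_fire


def snr_db(n_sig, sigma_r=20.0):
    """Shot-noise plus readout-noise SNR in dB for n_sig signal electrons."""
    return 10 * math.log10(n_sig ** 2 / (n_sig + sigma_r ** 2))


if __name__ == "__main__":
    for i_e in (1e5, 1e6, 1e7):
        n = collected_charge(i_e)
        print(f"{i_e:.0e} e-/s -> {n:9.0f} e- at firing, SNR = {snr_db(n):.1f} dB")
```

A strong pixel fires before T0 with a full well, while a weak pixel is forced to fire before the end of the frame with only a fraction of the well filled, which lowers its SNR.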

For video mode applications, we investigate the SNRs for the multiple sampling
technique and the time-based method. The simulation for multiple sampling was done
for a sensor with well capacity of 110,000 electrons, readout noise of 20 electrons, dark
current of 1fA and 9 captures at 32ms/2^8, 32ms/2^7, ..., 32ms. The simulation of
the time-based imager is performed for a sensor with the same well capacity, readout
noise and dark current, with a frame time of 32ms and a reference voltage
change point at 25ms. The simulation results are given in figure 3-9. We observe
that the time-based imager benefits the SNR in the strong signal region. The SNR
of the multiple sampling technique shows numerous 3dB dips since each pixel cannot
take advantage of the full well capacity [35]. It should be pointed out here that, for a
multiple sampling imager with APS architecture, the SNR is about 2-3dB worse than
the simulated results due to the narrowed signal swing as discussed in [36].

Figure 3-9: SNR comparison of multiple sampling imagers and TTFS imagers

3.4  Summary  and  Discussion 

The dynamic range limitation of conventional CMOS APSs has been discussed
in this chapter. Several existing dynamic range enhancement methods were reviewed.
A novel time-to-first-spike CMOS imager was introduced in detail, which has
demonstrated superior performance, e.g., DR and SNR, compared to conventional CMOS
APSs. In addition, we summarize the comparison between the TTFS_classic imager
and existing high dynamic range enhancement methods in table 3-1. One unique issue with
the TTFS_classic imager is the situation when many pixels send out request signals
within a short period. The asynchronous readout circuit has to deal with this
collision problem, and some amount of delay will inevitably be introduced, resulting in
some temporal errors for the reconstructed illumination. We will address this issue
and propose some architectures to solve this problem in Chapter 6.
Table 3-1: Comparison of the TTFS_classic imager and existing high dynamic range
enhancement methods

Sensor Type     SNR     Power   Memory      Video   Collision
NSC             Low     Low     Few         Yes     No
MS              Good    High    More        Yes     No
TBAM            Good    High    More        No      No
TBSM            Good    High    Few/More    Yes     No
TTFS_classic    Good    Low     Few         Yes     Yes

The previously discussed dynamic range enhancement methods all attempt to
extend the dynamic range towards the strong signal region. Alternatively, the
dynamic range can be extended over the weak signal region by either a
dark current reduction approach [15] or a statistical signal processing method [37].


CHAPTER  4 

AN  OPTIMAL  TWO  SEGMENT  PIECE-WISE  LINEAR
STRATEGY  FOR  REFERENCE  VOLTAGE  VARIATION 

For  video  mode  applications,  the  maximum  integration  time  is  limited  by  the 
frame  time,  which  is  typically  33ms.  Dynamic  range  would  be  limited  if  using  a con- 
stant reference  voltage,  as  discussed  in  the  previous  chapter.  By  varying  the  reference 
voltage,  all  the  useful  illuminance  information  can  be  collected  within  each  frame  pe- 
riod. So  far,  there  has  not  been  any  systematic  study  of  what  is  the  optimal  strategy 
for  the  reference  voltage  variation.  T.  Chen  [38]  has  studied  an  optimal  scheduling 
of  capture  times  for  the  multiple  sampling  method  by  assuming  the  complete  inci- 
dent illumination  probability  density  function  (pdf)  is  known  in  advance.  In  [31], 
a sampling  strategy  has  been  discussed  for  a time-based  imager  with  synchronized 
readout  scheme  by  considering  the  shortest  sampling  interval.  These  two  papers  both 
try  to  acquire  an  optimal  solution  with  the  objective  of  achieving  maximum  average 
SNR under some constraints. For a TTFS imager, the reference voltage runs
in continuous mode; therefore, we cannot set individual capture times as required
in the above two cases. In this chapter, we will systematically analyze an optimal
two segment piece-wise linear strategy for the reference voltage (or equivalently, well
capacity) variation using the expected SNR as the objective. In the following
sections, we will first formulate this optimization problem. An off-line optimal strategy
is discussed in section 4.2, provided the required minimum time interval and the
real comparator-introduced time delay are given. In section 4.3, an on-line optimal
strategy is given by incorporating the nonuniform quantization noise.
Figure 4-1: SNR vs. photocurrent for different tchange. Here, Qwell = 110,000e⁻ and
tframe = 33ms.


4.1  Problem  Formulation 

Including all noise sources, SNR can be expressed as

$$SNR(i_{ph}) = \frac{(i_{ph}\,t)^2}{q\,(i_{ph} + i_{dark})\,t + \sigma_r^2 + \sigma_Q^2 + \sigma_{FPN}^2} \qquad (4.1)$$

where σQ and σFPN are the quantization noise and the FPN, respectively. It has been
shown  earlier  that  a TTFS  imager  enjoys  better  SNR  than  conventional  CMOS  APS 
imagers.  However,  this  achievement  depends  on  reference  voltage  selection  (see  figure 
4-1).  Thus,  our  optimization  problem  can  be  formulated  as  follows: 

For a two segment piece-wise linear reference voltage, find the optimal tchange,
which maximizes the expected SNR under some constraints.

In the following discussion, we will address well capacity instead of reference
voltage (see figure 4-2).¹

¹ tchange in figure 4-2 is interchangeable with T0 in figure 3-5.

Figure 4-2: Varying well capacity for a wide dynamic range


4.2  Optimal  Strategy  for  Uniform  Quantization 

Suppose we uniformly quantize the incident light intensity, which indicates that
the quantization noise σQ is the same for different photocurrents. In addition, for
simplicity of discussion, we ignore all the noise terms except for the photocurrent shot
noise. If we define ichange = Qwell/tchange, then SNR as a function of photocurrent is

$$SNR(i_{ph}) = \begin{cases}
\dfrac{Q_{well}}{q} & \text{if } i_{ph} \ge i_{change} \\[2ex]
\dfrac{i_{ph}\,(Q_{well} + k\,t_{change})}{q\,(i_{ph} + k)} & \text{if } i_{ph} < i_{change}
\end{cases}$$

Note that a large photocurrent’s SNR reaches the maximum value, while a small
photocurrent’s SNR directly depends on tchange, or the slope k = Qwell/(tframe − tchange).

4.2.1  Optimal  Strategy  under  Minimum  Time  Interval  Constraint 

If we suppose the incident illumination pdf f_I(i) is zero outside of a finite length
interval (imin, imax), where imin ≤ Qwell/tframe and imax ≥ Qwell/tframe, then this
optimization problem can be expressed as follows:

For a two segment piece-wise linear well capacity, given the photocurrent
quantization step ΔI and the required minimum time interval Δtmin, find the optimal
tchange, which maximizes the expected SNR

$$E(SNR) = \int_{i_{change}}^{i_{max}} \frac{Q_{well}}{q}\, f_I(i)\,di + \int_{i_{min}}^{i_{change}} \frac{i\,(Q_{well} + k\,t_{change})}{q\,(i + k)}\, f_I(i)\,di \qquad (4.2)$$

subject to imin ≤ ichange ≤ imax, and to the time interval Δt between two adjacent
quantized photocurrents’ firing times being no less than Δtmin, i.e., Δt ≥ Δtmin.

The minimum time interval Δtmin is introduced by the hardware implementation
limitation. The firing times of any two adjacent quantized photocurrents should be
sufficiently separated; otherwise we cannot distinguish them in the time domain.
For all photocurrents, SNR(iph) ≤ Qwell/q. Thus the upper bound of E(SNR) is Qwell/q.
If imin > Qwell/tframe, then the optimal solution is simply ichange = Qwell/tframe, and

$$E(SNR) = \int_{i_{min}}^{i_{max}} \frac{Q_{well}}{q}\, f_I(i)\,di = \frac{Q_{well}}{q} \qquad (4.3)$$

For this case, the expected SNR achieves its upper bound. Actually, such scene
information about the incident light intensity should be known in advance, so it is
generally an on-line method. However, our goal here is to provide an off-line strategy.
Suppose imin is less than ichange, and we rearrange the objective function as follows:

$$E(SNR(i_{change})) = \frac{Q_{well}}{q} - \int_{i_{min}}^{i_{change}} \frac{k\,(Q_{well} - i\,t_{change})}{q\,(i + k)}\, f_I(i)\,di
= \frac{Q_{well}}{q} - \frac{Q_{well}}{q} \int_{i_{min}}^{i_{change}} \frac{1 - i/i_{change}}{i/k + 1}\, f_I(i)\,di \qquad (4.4)$$

The above equation shows that if we could make k as large as possible and
simultaneously make ichange as small as possible, the expected SNR could be maximized
even though the exact information of pdf f_I(i) is unknown (see figure 4-3). As we
discussed earlier, the beauty of a still mode time-based image sensor is to allow each
pixel to reach the full capacity, thus resulting in a high DR and good SNR. So for
video mode applications, we also need to make each pixel’s available well capacity as
large as possible. Figure 4-4 shows that the expected SNR monotonically increases
with tchange by assuming the scene has a uniform pdf. Even though the scene pdf will
not be uniform in practice, a large tchange (or slope k) can still guarantee that each
pixel will reach its maximum well capacity and achieve the best SNR.
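
The monotonic trend can be reproduced with a minimal numerical sketch. It assumes the shot-noise-only SNR (Qwell/q for photocurrents at or above ichange, and iph(Qwell + k tchange)/(q(iph + k)) below it), a uniform pdf over 10fA to 1000fA, Qwell = 110,000e⁻ and tframe = 33ms, and it averages with a simple midpoint rule in units of electrons so that q = 1. The function name and step count are illustrative choices, not the author's code.

```python
Q_E = 1.602176634e-19  # elementary charge in coulombs


def expected_snr(t_change, q_well=110_000, t_frame=33e-3,
                 i_min=10e-15, i_max=1000e-15, steps=20_000):
    """Expected shot-noise-limited SNR (power ratio) for a uniform pdf."""
    k = q_well / (t_frame - t_change)      # slope of the varying segment, e-/s
    i_change = q_well / t_change           # photocurrent that fires at t_change, e-/s
    lo, hi = i_min / Q_E, i_max / Q_E      # integration limits in e-/s
    di = (hi - lo) / steps
    total = 0.0
    for n in range(steps):
        i = lo + (n + 0.5) * di            # midpoint of the n-th sub-interval
        if i >= i_change:
            total += q_well                # full well reached
        else:
            total += i * (q_well + k * t_change) / (i + k)
    return total / steps                   # uniform pdf: plain average


if __name__ == "__main__":
    for t_change in (5e-3, 15e-3, 25e-3, 32e-3):
        print(f"t_change = {t_change * 1e3:4.0f} ms -> E(SNR) = {expected_snr(t_change):9.0f}")
```

The printed expectation grows with tchange and never exceeds Qwell, consistent with the upper bound Qwell/q.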


Figure  4-3:  Illustration  of  optimization 


The general solution derived in the previous discussion indicates that, without any other constraints, the optimal slope k is infinite. In this case, however, all the photocurrents below $i_{change}$ will be quantized to a single value, which is undesirable. To solve this problem, we introduce a minimum required time interval constraint, named $\Delta t_{min}$. Two photocurrents that differ by $\Delta I$ can be differentiated in the time domain as long as the interval between their firing times is larger than $\Delta t_{min}$. Suppose we have two photocurrents $I_m$ and $I_{m+1}$, which are related by

$$I_{m+1} = I_m + \Delta I \qquad (4.5)$$

Then if the well capacity stays constant, we find that the time interval $\Delta t_m$ between the two photocurrents is

$$\Delta t_m = t_m - t_{m+1} = \frac{Q_{well}}{I_m} - \frac{Q_{well}}{I_{m+1}} = \frac{Q_{well}\,\Delta I}{I_m\, I_{m+1}} \qquad (4.6)$$

Figure 4-4: Expected SNR vs. $t_{change}$. Here, $Q_{well}$ = 110,000 e⁻ and $t_{frame}$ = 33 ms. Assume a uniform pdf over (10 fA, 1000 fA).

Clearly, a large photocurrent $I_m$ implies a small time interval $\Delta t_m$. In order to obtain all the useful information, we must ensure each time interval is larger than the lower bound $\Delta t_{min}$. A detailed analysis of the time intervals of large photocurrents can be found in [31], so we skip this part here. For a photocurrent pair $I_m$ and $I_{m+1}$ that are both less than $i_{change}$, the time interval is


$$\Delta t_m = t_m - t_{m+1} = \frac{(Q_{well} + k\,t_{change})\,\Delta I}{(I_{m+1} + k)(I_m + k)} \qquad (4.7)$$


It can be found that the shortest time interval occurs around the point where the well capacity starts to change. Now considering the photocurrent pair $i_{change}$ and $i_{change} - \Delta I$, we have

$$\Delta t_{worst} = \frac{Q_{well} + k\,t_{change}}{i_{change} - \Delta I + k} - \frac{Q_{well}}{i_{change}} \qquad (4.8)$$


In general, $k \gg \Delta I$, which yields

$$\Delta t_{worst} \approx \frac{\Delta I\,(k\,t_{frame} - Q_{well})^2}{k^3\, t_{frame}} \qquad (4.9)$$
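As a quick numerical check of this approximation, the sketch below evaluates the exact worst-case interval of equation 4.8 next to the approximation of equation 4.9. Currents and slopes are expressed in electrons per second; the function names and the sample slope are illustrative assumptions rather than values from the design.

```python
Q_E = 1.602e-19  # elementary charge in coulombs

def worst_interval_exact(k, delta_i, q_well, t_frame):
    """Equation 4.8: firing-time gap between i_change - delta_i and i_change."""
    t_change = t_frame - q_well / k            # start of the falling segment
    i_change = q_well / t_change
    return (q_well + k * t_change) / (i_change - delta_i + k) - q_well / i_change

def worst_interval_approx(k, delta_i, q_well, t_frame):
    """Equation 4.9, valid when delta_i is much smaller than k."""
    return delta_i * (k * t_frame - q_well) ** 2 / (k ** 3 * t_frame)

# Q_well, t_frame and delta_i follow the figure captions; k is an arbitrary sample.
q_well, t_frame = 110_000.0, 33e-3
delta_i = 10e-15 / Q_E                         # 10 fA in e-/s
k = 10 * q_well / t_frame                      # assumed slope, e-/s
exact = worst_interval_exact(k, delta_i, q_well, t_frame)
approx = worst_interval_approx(k, delta_i, q_well, t_frame)
```

For these sample values the two results differ by well under one percent, which is consistent with the assumption that $\Delta I$ is negligible next to k.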


Let $\Delta t_{worst} > \Delta t_{min}$, then we have the following inequality

$$\frac{\Delta I\,(k\,t_{frame} - Q_{well})^2}{k^3\, t_{frame}} > \Delta t_{min} \qquad (4.10)$$


It provides an upper bound $k_{max1}$ for the slope k with respect to the required minimum time interval $\Delta t_{min}$ and the given photocurrent resolution $\Delta I$. In other words, $k_{max1}$ is the optimal solution. Intuitively, a small required minimum time interval leads to a large optimal slope k, which means an infinitely fast clock and an infinitely large throughput result in an infinite k. Figure 4-5 verifies this intuition. In contrast, figure 4-6 shows that the optimal solution monotonically decreases with the photocurrent resolution $\Delta I$. That is, we need to extend the time interval for two adjacent photocurrents with a small difference.


Figure 4-5: Effect of $\Delta t_{min}$ on the optimal solution for uniform quantization, (a) effect of $\Delta t_{min}$ on $k_{max1}$; (b) effect of $\Delta t_{min}$ on $t_{change}$. Here, $t_{frame}$ = 33 ms, $\Delta I$ = 10 fA and $Q_{well}$ = 110,000 e⁻.


4.2.2  Comparator  Delay  Considerations 

In a typical TTFS imager design, the pixel-level comparator also works as an opamp to implement an autozeroing technique for offset FPN reduction. In addition, because of power conservation considerations, the resulting -3dB frequency is so low that the comparator-introduced time delay is not negligible. So in this section, we will investigate the effect of comparator-introduced delay on firing times.

Figure 4-6: Effect of $\Delta I$ on the optimal solution for uniform quantization, (a) effect of $\Delta I$ on $k_{max1}$; (b) effect of $\Delta I$ on $t_{change}$. Here, $t_{frame}$ = 33 ms, $\Delta I$ = 10 fA and $Q_{well}$ = 110,000 e⁻.

According to the small-signal model of the comparator, when the input signal approaches the reference signal, a comparator can be viewed as a low-pass filter given by

$$f(s) = \frac{A}{1 + s/\omega_p} \qquad (4.11)$$

where A is the small-signal comparator gain, and $\omega_p$ is the -3dB frequency. Assuming a photocurrent i, a photodiode associated capacitance C and a small signal region $|v_{in}| < a$, the small signal at the comparator input can be written as

$$v_{in}(t) = h\,t - a \qquad (4.12)$$

where $h = i/C$. According to Kirchhoff's current law, this system can also be described by the differential equation

$$\frac{dv_{out}(t)}{dt} + \frac{v_{out}(t)}{\tau} = \frac{A\,v_{in}(t)}{\tau} \qquad (4.13)$$
Here $\tau = 1/\omega_p$, and $v_{out}(\cdot)$ is the small output signal. The solution to this well-known differential equation turns out to be

$$v_{out}(t) = A\,v_{in}(t) - A h \tau + c\,e^{-t/\tau} \qquad (4.14)$$

With the initial conditions $v_{out}(0) = -b$, $v_{in}(0) = -a$ and $b \approx Aa$, we have $c = Ah\tau + Aa - b \approx Ah\tau$. Then the resulting $v_{out}(t)$ will be

$$v_{out}(t) = A\,v_{in}(t) - A h \tau\,(1 - e^{-t/\tau}) \qquad (4.15)$$

If we define $t_1$ as the time when the input reaches the reference voltage (i.e., the small signal $v_{in}(t_1) = 0$) and suppose the output small signal reaches 0 at $t_2$, we have

$$\Delta t_d = t_2 - t_1 = \tau\,(1 - e^{-t_2/\tau}) \qquad (4.16)$$

where $\Delta t_d$ is the time delay between $t_2$ and $t_1$. The key observation here is that the time delay will roughly equal $\tau$ if $t_2 > 4\tau$. This has been verified by our CADENCE simulations. The analytic solution of $t_2$ can be obtained from the following

$$h\,t_2 - a = h\tau\,(1 - e^{-t_2/\tau}) \qquad (4.17)$$

It shows that $t_2$ is a function of h or i, and so is $\Delta t_d$. From the time delay point of view, an increased i will result in a smaller delay. Provided that the real comparator delay is on the same order as the required minimum time interval $\Delta t_{min}$, we have to take this time delay constraint into consideration. The real readout time stamp now becomes $T' = T + \Delta t_d(i)$, where T is the ideal firing time, and $\Delta t_d(\cdot)$ is the real comparator-introduced delay. As before, the most serious effect of the comparator delay on the time interval happens around the point where the well capacity starts to vary. To see this effect, we rewrite the worst time interval as

$$\Delta t'_{worst} = \Delta t_{worst} + \Delta t_d(i_{change} - \Delta I) - \Delta t_d(i_{change}) \qquad (4.18)$$
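To get a feel for how the delay of equation 4.16 varies with photocurrent, the following minimal sketch solves equation 4.17 for $t_2$ by bisection. The capacitance value and the two sample photocurrents are assumptions chosen only for illustration; a and $\tau$ follow the values quoted in the figure captions of this chapter.

```python
import math

def comparator_delay(h, a, tau, iters=200):
    """Solve h*t2 - a = h*tau*(1 - exp(-t2/tau)) for t2 by bisection.

    Returns (t2, delay), where delay = tau*(1 - exp(-t2/tau)) as in equation 4.16.
    """
    g = lambda t: h * t - a - h * tau * (1.0 - math.exp(-t / tau))
    lo, hi = 0.0, a / h + tau          # g(lo) = -a < 0 and g(hi) >= 0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    t2 = 0.5 * (lo + hi)
    return t2, tau * (1.0 - math.exp(-t2 / tau))

a, tau = 10e-3, 1e-6                   # 10 mV small-signal region, 1 us time constant
cap = 20e-15                           # assumed photodiode capacitance in farads
slow = comparator_delay(10e-15 / cap, a, tau)    # weak light: 10 fA
fast = comparator_delay(200e-12 / cap, a, tau)   # strong light: 200 pA
```

With these assumed values the weak photocurrent sees a delay essentially equal to $\tau$, while the strong photocurrent reaches $t_2$ before $4\tau$ and therefore sees a shorter delay, which is the mismatch the following discussion sets out to avoid.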
Assume $\Delta t_d(i_{change})$ is only decided by $i_{change}$, while both $i_{change} - \Delta I$ and slope k contribute to $\Delta t_d(i_{change} - \Delta I)$. As k increases, $\Delta t_d(i_{change} - \Delta I)$ becomes much less than $\Delta t_d(i_{change})$, which in turn results in a much smaller actual time interval. In this case, inequality 4.10 no longer holds. To avoid this problem, there should be another upper bound for slope k, which ensures the comparator-introduced time delay for each photocurrent is roughly the same. As shown previously, if $t_2$ is larger than $4\tau$, then the time delay can be treated as a constant $\tau$. Now replacing h with $(i_{change} - \Delta I + k)/C$, equation 4.17 yields


$$(i_{change} - \Delta I + k)\,t_2/C - a = (i_{change} - \Delta I + k)/C \cdot \tau\,(1 - e^{-t_2/\tau}) \qquad (4.19)$$


To ensure $t_2$ is greater than $4\tau$, we must have¹

$$i_{change} - \Delta I + k \le \frac{a \cdot C}{3\tau} \qquad (4.20)$$


Substituting $i_{change}$ with $\frac{Q_{well}}{t_{frame} - Q_{well}/k}$ and assuming $i_{change} \gg \Delta I$, the above inequality becomes

$$k^2 - \frac{a}{3\tau}\,C \cdot k + \frac{Q_{well}}{t_{frame}}\,\frac{a}{3\tau}\,C \le 0 \qquad (4.21)$$


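It gives us another upper bound $k_{max2}$, which is the larger root of the quadratic in inequality 4.21:

$$k_{max2} = \frac{1}{2}\left[\frac{a}{3\tau}\,C + \sqrt{\left(\frac{a}{3\tau}\,C\right)^2 - 4\,\frac{a}{3\tau}\,C\,\frac{Q_{well}}{t_{frame}}}\,\right] \qquad (4.22)$$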


The  two  upper  bounds  for  the  slope  k occur  under  two  different  constraints. 
We  simply  take  the  minimum  of  the  two  upper  bounds  as  our  final  solution  for  this 
optimization  problem,  which  is 


$$k_{opt} = \min(k_{max1},\, k_{max2}) \qquad (4.23)$$


¹ In the derivation, we make the following approximation: $1 - e^{-t_2/\tau} \approx 1$.
From the simulation results shown in figure 4-7, we observe that the optimal solution is decided solely by the required minimum time interval when that interval is large enough. In contrast, the time delay constraint plays the key role in determining the optimal solution if the required minimum time interval is much smaller. In figure 4-8, the optimal solution vs. $\Delta I$ is plotted. It shows that the required minimum time interval constraint dominates in the small $\Delta I$ region, whereas the optimal solution is mainly decided by the time delay constraint for a large $\Delta I$.


Figure 4-7: Optimal slope k and $t_{change}$ vs. $\Delta t_{min}$ under time delay constraint. Here, a = 10 mV, $t_{frame}$ = 33 ms, $\tau$ = 1 µs, $\Delta I$ = 10 fA and $Q_{well}$ = 110,000 e⁻.


4.2.3  Optimization  Steps  for  Uniform  Quantization 

In  summary,  this  optimization  problem  can  be  solved  with  the  following  steps: 


1. Compute the upper bound $k_{max1}$ under the required minimum time interval constraint for the ideal comparator case using inequality 4.10;

2. Compute the upper bound $k_{max2}$ under the time delay constraint using equation 4.22;

3. Take the minimum of the two upper bounds as the final optimal solution $k_{opt}$; the optimal $t_{change}$ then follows from

$$t_{change} = t_{frame} - \frac{Q_{well}}{k_{opt}}$$
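The three steps can be sketched in a few lines of code. This is a minimal illustration rather than the procedure used to generate the figures: currents and slopes are in electrons per second, inequality 4.10 is solved by bisection on the normalized slope $x = k\,t_{frame}/Q_{well}$, and the capacitance `cap` is an assumed value that does not come from the text.

```python
import math

Q_E = 1.602e-19  # elementary charge in coulombs

def k_max1(delta_i, dt_min, q_well, t_frame, iters=200):
    """Step 1: largest slope k (e-/s) that still satisfies inequality 4.10."""
    c = delta_i * t_frame ** 2 / q_well           # time scale of the constraint, s
    g = lambda x: c * (x - 1.0) ** 2 / x ** 3     # left side of 4.10 at k = x*q_well/t_frame
    if g(3.0) < dt_min:                           # g is largest at x = 3
        raise ValueError("dt_min cannot be met by any slope")
    lo, hi = 3.0, c / dt_min                      # g(hi) < dt_min because g(x) < c/x
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) >= dt_min:
            lo = mid
        else:
            hi = mid
    return lo * q_well / t_frame

def k_max2(a, cap, tau, q_well, t_frame, n_tau=4):
    """Step 2: larger root of the delay-constraint quadratic (t2 > n_tau * tau)."""
    b = a * cap / ((n_tau - 1) * tau) / Q_E       # a*C/(3*tau) in e-/s when n_tau = 4
    disc = b * b - 4.0 * b * q_well / t_frame
    if disc < 0.0:
        raise ValueError("delay constraint cannot be met by any slope")
    return 0.5 * (b + math.sqrt(disc))

def optimal_slope(delta_i, dt_min, a, cap, tau, q_well, t_frame):
    """Step 3: k_opt = min(k_max1, k_max2) and the matching t_change."""
    k_opt = min(k_max1(delta_i, dt_min, q_well, t_frame),
                k_max2(a, cap, tau, q_well, t_frame))
    return k_opt, t_frame - q_well / k_opt

k_opt, t_change = optimal_slope(delta_i=10e-15 / Q_E, dt_min=50e-6, a=10e-3,
                                cap=20e-15, tau=1e-6,   # cap is an assumed value
                                q_well=110_000.0, t_frame=33e-3)
```

Passing `n_tau=5` to `k_max2` gives the more conservative bound obtained when a larger safety margin on $t_2$ is required.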


Figure 4-8: Optimal slope k and $t_{change}$ vs. $\Delta I$ under time delay constraint. Here, a = 10 mV, $t_{frame}$ = 33 ms, $\tau$ = 1 µs, $\Delta t_{min}$ = 50 µs and $Q_{well}$ = 110,000 e⁻.


4.2.4  Discussion 

This optimization method does not need any specific information about the scene illumination, so it is an off-line method. Certain scene illumination information, however, is helpful for solving this optimization problem. When the readout noise and the FPN are not negligible, the upper bound of the expected SNR will decrease, but the final optimal solution still holds. The constraints we consider here are all time related. In addition, if a high-speed comparator with a time constant of several ns is used, the time delay will no longer be an issue. To achieve a larger safety margin, we can let $t_2 > 5\tau$. Then the upper bound $k_{max2}$ will be

$$k_{max2} = \frac{1}{2}\left[\frac{a}{4\tau}\,C + \sqrt{\left(\frac{a}{4\tau}\,C\right)^2 - 4\,\frac{a}{4\tau}\,C\,\frac{Q_{well}}{t_{frame}}}\,\right] \qquad (4.24)$$


4.3  Optimal  Strategy  for  Nonuniform  Quantization 

Figure 4-9: Mapping between photocurrents and firing times. Here, $\Delta_{time}$ = 0.5 ms, $t_{change}$ = 10 ms and $t_{frame}$ = 33 ms.

Pixel-parallel automatic gain control is an inherent characteristic of time-based image sensors [27]. Uniform quantization along the time axis will result in nonuniform quantization in the photocurrent domain. Generally, nonuniform quantization is beneficial for reducing the quantization noise for the predominantly weak photocurrents at the expense of an increase in noise for the rarely occurring strong photocurrents. The relationship between photocurrents and corresponding quantization values in the time domain is illustrated in figure 4-9. It has a typical compression characteristic, having a much steeper slope for small magnitude photocurrents than for large magnitude photocurrents. Thus a given signal change in the small magnitude region will carry the uniform quantizer through more steps than the same change in the large magnitude region. Similar to standard compression curves used in voice transmission, e.g., µ-Law and A-Law, varying the reference voltage also aims to achieve continuity at the origin of the photocurrent. Unlike for the uniform quantizer, it is not suitable for a nonuniform quantizer to use a single average quantization noise power for different input signal levels. For a midrise quantizer, the mean squared error (MSE) for a signal in the r-th quantizing interval is computed as

$$MSE_r = \int_{x_r - \Delta_r/2}^{x_r + \Delta_r/2} (x - x_r)^2\, p_r\, dx \qquad (4.25)$$
where $\Delta_r$ is the step size of the quantizing interval, and $p_r$ is the input signal pdf. Assuming the signal is uniformly distributed over $[x_r - \Delta_r/2,\ x_r + \Delta_r/2]$, we have

$$MSE_r = \frac{\Delta_r^2}{12} \qquad (4.26)$$

Here we regard the MSE as the quantization noise, which is

$$\sigma_q^2(i) = \frac{\Delta_r^2}{12}, \qquad \Delta_r = I_{r+1} - I_r \ \text{ and } \ I_r < i < I_{r+1} \qquad (4.27)$$

where $I_r$ and $I_{r+1}$ are quantization levels, decided by the well capacity varying curve and the quantization step size $\Delta_{time}$ in the time domain. They can be expressed as

$$I_r = \begin{cases} \dfrac{Q_{well}}{r \cdot \Delta_{time}} & \text{if } r \cdot \Delta_{time} \le t_{change} \\[2mm] \dfrac{Q_{well}\,(t_{frame} - r \cdot \Delta_{time})}{(t_{frame} - t_{change})\,(r \cdot \Delta_{time})} & \text{if } r \cdot \Delta_{time} > t_{change} \end{cases} \qquad (4.28)$$

Figure 4-10: SNR vs. photocurrent when including the quantization noise. Here, $t_{frame}$ = 33 ms, $t_{change}$ = [illegible] ms and $Q_{well}$ = 110,000 e⁻.

Figure 4-11: Comparison of SNRs for different values of $t_{change}$, (a) $\Delta_{time}$ = 3 µs, (b) $\Delta_{time}$ = 0.5 µs. Here, $t_{frame}$ = 33 ms and $Q_{well}$ = 110,000 e⁻.


The above equation indicates that the well capacity varying curve not only decides the quantization levels but also determines the quantization noise. Keeping the quantization noise and the photocurrent shot noise, we rewrite the expression of SNR as follows:

$$SNR(i) = \frac{(i\,t)^2}{i\,t\,q + \sigma_q^2(i) \cdot t^2} \qquad (4.29)$$

The new SNR expression is plotted in figure 4-10. The simulated SNR illustrates that when the quantization noise is quite small, the shot noise dominates the noise power, and when the quantization noise is comparable to the shot noise, the SNR is dramatically degraded. If a long $t_{change}$ is applied, the quantization step size becomes large for the small magnitude photocurrents, which leads to a large quantization noise and a worse SNR. On the other hand, the same long $t_{change}$ benefits the SNR of the medium magnitude photocurrents thanks to the large achievable well capacity (see figure 4-11(a)).


Including the quantization noise, our optimization problem becomes: for a two segment piece-wise linear well capacity, given the time quantization step size $\Delta_{time}$ and the incident light pdf $f_I(i)$, find the optimal $t_{change}$ that maximizes the expected SNR

$$E(SNR) = \int SNR(i) \cdot f_I(i)\, di \qquad (4.30)$$

subject to $0 \le t_{change} \le t_{frame}$,

where $SNR(i)$ is given in equation 4.29. If $\Delta_{time}$ is small enough, $SNR(i)$ will reduce to $it/q$ (see figure 4-10 and figure 4-11(b)). The optimization here needs to know the pdf of the incident light, so it is in general an on-line method. This rough scene information may be gathered during the previous captures. In the following discussion, we will investigate the optimization problem by assuming the complete scene illuminance information is known in advance.

Figure 4-12: Photocurrent histogram of vinesunset

Suppose $f_I(i)$ is zero outside $(i_{min}, i_{max})$. As before, if $i_{min} \ge Q_{well}/t_{frame}$, then the solution of equation 4.30 is simply $t_{change} = t_{frame}$. A decreased $t_{change}$ increases the small signal's SNR but degrades the large signal's SNR. Besides, the expected SNR (ESNR) depends on not only $t_{change}$ but also the incident intensity pdf. Though few natural scenes exhibit uniform illumination statistics, any pdf can be approximated by a piece-wise uniform pdf. Then assuming that the pdf is uniform over $(i_{min1}, i_{max1}), \cdots, (i_{minN}, i_{maxN})$, the objective function becomes:

$$E(SNR) = \sum_{k=1}^{N} p_k \cdot \int_{i_{mink}}^{i_{maxk}} \frac{(i\,t)^2}{i\,t\,q + \sigma_q^2(i) \cdot t^2}\, di \qquad (4.31)$$

where the constant $p_k$ is the pdf value over the range $(i_{mink}, i_{maxk})$.

Figure 4-13: ESNR vs. $t_{change}$. For $\Delta_{time}$ = 0.5 µs, optimal $t_{change}$ = 32.3 ms; and for $\Delta_{time}$ = 10 µs, optimal $t_{change}$ = 28.2 ms.

Vinesunset is a high dynamic range image, which will be introduced in Chapter 6. Its original pixel values are expressed as floating point values and are proportional to the true light intensities. In our simulation, each value is simply scaled by $100\,i_d$, where $i_d$ is the dark current, i.e., $i_d$ = 1 fA. Figure 4-12 shows the histogram of the scaled vinesunset's light intensities, which can be approximated by a five segment piece-wise uniform pdf, i.e., $(i_d, 40i_d)$, $(40i_d, 200i_d)$, $(200i_d, 750i_d)$, $(750i_d, 1500i_d)$ and $(1500i_d, 1900i_d)$. We plotted equation 4.31 with respect to $t_{change}$ in figure 4-13, which is obviously a convex curve. The optimal solution is the $t_{change}$ corresponding to the maximum ESNR. Two different values of $\Delta_{time}$ are compared here, of which the smaller step size enjoys the better ESNR because of its lower quantization noise. Additionally, figure 4-13 also shows that as the integration time increases, an enlarged well capacity benefits the ESNR as expected, and when $t_{change}$ becomes too large, the quantization noise becomes pronounced and the ESNR drops dramatically.
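The ESNR search described in this section can be sketched numerically as follows. This is an illustrative reconstruction under stated assumptions, not the simulation behind figure 4-13: currents are handled in electrons per second, the spike is assumed to be recorded at the first clock tick after the firing time, and the segment probabilities in `MASSES` are hypothetical placeholders because the actual histogram values are not listed in the text.

```python
import numpy as np

Q_E = 1.602e-19  # elementary charge in coulombs

def level(t, t_change, q_well, t_frame):
    """Photocurrent (e-/s) that fires exactly at time t: well capacity at t over t."""
    falling = q_well * (t_frame - t) / (t_frame - t_change)
    return np.where(t <= t_change, q_well, falling) / t

def snr(i, t_change, delta_time, q_well=110_000.0, t_frame=33e-3):
    """SNR (power ratio) for currents i in e-/s with shot and quantization noise."""
    k = q_well / (t_frame - t_change)                  # slope of the falling segment
    t = np.where(i >= q_well / t_change, q_well / i, k * t_frame / (i + k))
    r = np.maximum(np.ceil(t / delta_time), 2.0)       # tick index that records the spike
    t_hi = np.minimum(r * delta_time, t_frame)
    step = (level((r - 1.0) * delta_time, t_change, q_well, t_frame)
            - level(t_hi, t_change, q_well, t_frame))  # width of the current bin
    signal = i * t                                     # collected electrons
    return signal ** 2 / (signal + step ** 2 / 12.0 * t ** 2)

def esnr(t_change, delta_time, edges, masses, n=2000):
    """Expected SNR for a piece-wise uniform pdf; edges in amperes, masses sum to 1."""
    total = 0.0
    for lo, hi, mass in zip(edges[:-1], edges[1:], masses):
        i = np.linspace(lo, hi, n) / Q_E               # segment currents in e-/s
        total += mass * snr(i, t_change, delta_time).mean()
    return total

I_D = 1e-15                                            # dark current, 1 fA
EDGES = [I_D, 40 * I_D, 200 * I_D, 750 * I_D, 1500 * I_D, 1900 * I_D]
MASSES = [0.35, 0.30, 0.20, 0.10, 0.05]                # hypothetical segment probabilities

def best_t_change(delta_time, grid=np.linspace(1e-3, 32.9e-3, 320)):
    """Grid search for the t_change that maximizes the ESNR."""
    scores = [esnr(tc, delta_time, EDGES, MASSES) for tc in grid]
    return grid[int(np.argmax(scores))]
```

Calling `best_t_change(0.5e-6)` and `best_t_change(10e-6)` compares the two step sizes; the resulting optima depend on the assumed `MASSES` and are not expected to match the values read from figure 4-13.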

4.4  Summary 

This chapter presented a systematic study of an optimal two segment piece-wise linear strategy for the reference voltage (or well capacity) variation. For uniform quantization, an optimal solution is achieved by considering the minimum time interval requirement and the real comparator delay. This off-line method does not require any explicit incident light intensity information. A time-based method essentially implements nonuniform quantization of photocurrent, which benefits weak signals' SNR, and therefore the expected SNR. Given the pdf of photocurrents, the optimal $t_{change}$ can be determined by maximizing the ESNR. Approximate scene statistics could be obtained from previous captures. When the reset noise and the FPN are not negligible, the ESNR will be further degraded; however, the discussed optimal strategy still holds.


CHAPTER  5 

TTFS-CLASSIC  IMAGER  DESIGN 


A 128 × 128 TTFS_classic imager was designed using TSMC 0.18 µm digital CMOS technology. In this chapter, the design is discussed at the circuit level. Section 5.1 details the pixel design, followed by single pixel test results presented in section 5.2. A detailed description of the asynchronous readout circuitry design and layout considerations are included in section 5.3 and section 5.4 respectively. Section 5.5 demonstrates the prototype chip test results. Finally, this chapter concludes with section 5.6.


5.1  Pixel  Design 


5.1.1  Pixel  Operation 

The pixel schematic of the TTFS_classic imager is shown in figure 5-1. It contains a photodiode, a comparator and a digital control circuit. Note that the signals labelled with ~ are active-low. A global control logic provides $V_{ref}$ or $V_{reset}$ to each pixel according to the reset control signal. There are two phases in each frame, i.e., the reset phase and the comparison phase. The pixel works as follows:

1. The photodiode is initially reset to $V_{reset}$ via a negative feedback loop.

2. After the reset control signal goes high, the photodiode is discharged and the voltage across the photodiode drops linearly. When the voltage drops below $V_{ref}$, the comparator output node flips and the node req goes high. Then the pixel sends out a request to the row arbiter by pulling down ~row_request(m).

3. If the row arbiter selects this row by making row_select(m) high, the column request signals ~col_request(n) will be sent to the column latch.

4. After the ~col_request(n) signals are latched, the corresponding control signals, disable(m) and col_req(n), are generated to disable the pixel by switching on transistor M2. Thus the pixel will not fire again until the next reset phase turns off M2.




Figure  5-1:  Pixel  schematic  of  the  TTFS_classic  imager 


We implement the front-end circuit using standard thick oxide (3.3 V) transistors (labelled with *) to avoid high gate and subthreshold leakage currents. Implementing the comparator using thick oxide transistors also makes it possible to use the high power supply (3.3 V) to increase the signal swing. In order to shift the high voltage down to the nominal 1.8 V supply, INV1, built with thick oxide transistors, is included as a level shifter. The positive feedback following the comparator is used to make the comparator output node immune to the switching noise. Note that the control signal ~isolation, which is the same as rst except for having a delayed falling edge, is used to disable the positive feedback during the reset phase. The pixel layout is shown in figure 5-2. The pixel is 12.4 × 12.1 µm² with a 2.99 × 2.97 µm² light sensing area and has a fill factor of about 6%. Pixels are mirrored in the array in order to share the n-well and some of the power and bias lines. To prevent light-induced currents from affecting the analog circuitry or causing latchup, we used the sixth metal layer everywhere except over the photosensitive node.
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Figure  5-2:  Pixel  layout  of  the  TTFS_classic 


Our first pixel design in TSMC 0.18 µm digital CMOS technology contained a slow opamp/comparator without positive feedback, as shown in figure 5-3. The testing results show that the firing time is inversely proportional to the light intensity. However, there is a problem: several unexpected oscillations or pulses follow the expected one. It turns out that the period of oscillation is related to the light intensity; that is, the weaker the light, the longer the oscillation. We re-simulated the pixel with CADENCE, including corner analysis, and found that the oscillation is due to the voltage of the comparator output node, especially for the Slow NMOS and Slow PMOS corner. The reason is that when the comparator flips, for some period the output voltage lies in the transition region of the following inverter. Since the PADOUT pad has parasitic inductance, when the huge transient current is generated, the chip GND or VDD will oscillate, and that in turn will cause the output node of the comparator to oscillate. This is the so-called Simultaneous Switching Noise (SSN) impact. Since a weaker light generates a smaller current, the diode discharges slowly, and therefore the comparator output stays in the transition region longer. To solve this problem, we adopted a relatively fast opamp/comparator followed by a positive feedback in our later design, which will be analyzed in section 5.1.4. Both the fast comparator and the positive feedback help speed up the comparator output voltage transition, thus reducing the oscillations.




Figure  5-3:  8-transistor  opamp 

5.1.2  Photodiode  Design 

As discussed in Chapter 2, the photodiode can be formed by psub/nwell, psub/n+ or p+/nwell. To widen the depletion region and thus favor photocurrent generation, we choose a psub/nwell diode, shown in figure 5-4, for our TTFS_classic imager.

Figure 5-4: Diagram of a psub/nwell photodiode

The total capacitance at the cathode of the photodiode is the sum of the junction capacitance of the psub/nwell diode, the comparator input PMOS transistor's gate capacitance, the MOS capacitor $C_{M1}$, and the drain capacitance of the reset PMOS transistor M0, which is

$$C_{pd} = C_j + C_g + C_{M0,drain} + C_{M1} \qquad (5.1)$$


According to the MOSIS provided electrical parameters of the TSMC 0.18 µm digital CMOS technology, the unit area capacitance between nwell and substrate is 70 aF/µm², so $C_j$ is approximately 0.62 fF, which is quite small compared to the other components in $C_{pd}$. The MOS capacitor $C_{M1}$ may exhibit strong voltage modulation effects [39]. Figure 5-5 shows the simulated capacitance curve for $C_{M1}$ when the gate bias is positive. We observe that beyond strong inversion, i.e., $V_{bias}$ > 1 V, the capacitance is nearly constant. As discussed in Chapter 2, to achieve good linearity for the photodiode output voltage with respect to the integration time, the MOS capacitor $C_{M1}$ has to be constant. This bias dependent property of the MOS capacitor therefore sets the lower bound of the reference voltage, i.e., $V_{ref}$ = 1 V. In addition, an increased well capacity has two other advantages: first, it increases the SNR, and second, it pushes strong light intensities away from the short firing time region, thus yielding lower reconstruction errors.
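The junction capacitance and fill factor quoted in this chapter can be cross-checked with simple area arithmetic. The sketch below assumes a unit-area capacitance of about 70 aF/µm², a value inferred here from the quoted 0.62 fF and the 2.99 × 2.97 µm² sensing area, and it ignores any sidewall contribution; the function names are illustrative.

```python
def junction_capacitance_fF(width_um, height_um, unit_cap_aF_per_um2):
    """Area junction capacitance in fF (sidewall contribution ignored)."""
    return width_um * height_um * unit_cap_aF_per_um2 / 1000.0

def fill_factor(sense_w_um, sense_h_um, pixel_w_um, pixel_h_um):
    """Ratio of the light sensing area to the total pixel area."""
    return (sense_w_um * sense_h_um) / (pixel_w_um * pixel_h_um)

c_j = junction_capacitance_fF(2.99, 2.97, 70.0)   # 70 aF/um^2 is an inferred value
ff = fill_factor(2.99, 2.97, 12.4, 12.1)          # pixel is 12.4 um x 12.1 um
```

Both results land close to the figures stated in the text (about 0.62 fF and about 6%).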
Figure 5-5: General behavior of the MOS capacitor $C_{M1}$

5.1.3  Autozeroing  Reset 

For most conventional CMOS APSs, the reset process is simply performed by a PMOS or NMOS transistor, and the fixed pattern noise is reduced by applying correlated double sampling (CDS) to the output analog voltage. However, for TTFS_classic imagers, a digital signal is directly read out from a pixel, so it is impossible to apply CDS. Noting that the major FPN in TTFS_classic imagers comes from the comparator random offset, we adopt a typical DC offset reduction method, autozeroing (AZ) [40], to reset the photodiode.


Figure  5-6:  Autozeroing  offset  cancellation  principle 
During the reset phase, the comparator is disconnected from the signal path and connected in a unity-gain configuration as shown in figure 5-6. Assuming that the steady state has been achieved, the voltage $V_{pd}$ obtained across the capacitor $C_{pd}$ is

$$V_{pd} = V_{reset} + \frac{A}{1 + A}\, V_{off} \qquad (5.2)$$

After the reset phase, M0 is open and the comparator is connected again to the signal path. Thus, the offset $V_{off}$ is stored on the capacitor if the comparator gain A is large enough. However, there is still an offset residue due to the finite opamp gain A, which is

$$\Delta V_{error,1} = \frac{V_{off}}{1 + A} \qquad (5.3)$$


In addition to the finite gain, the charge injection from switch M0 and the clock feedthrough also cause errors. The total injected inversion charge is estimated to be

$$Q_{inj} = \alpha\, C_{ox} W_0 L_0\, (|V_{gs0}| + V_{th,p0}) \qquad (5.4)$$

where $\alpha$ is the proportion of M0's channel charge transferred to the capacitor $C_{pd}$, $W_0$ and $L_0$ are the width and length of M0 respectively, and $V_{th,p0}$ is the threshold voltage of PMOS transistor M0 and has a negative value. Then the resulting error due to the charge injection is $\Delta V_{error,2} = Q_{inj}/C_{pd}$.

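The clock feedthrough arises when M0 couples the clock transition to the sampling capacitor through its gate-source overlap capacitance $C_{ov0}$, and the resulting error is

$$\Delta V_{error,3} = \frac{C_{ov0}}{C_{pd} + C_{ov0}}\, V_h \qquad (5.5)$$

Then, the total residual offset equals

$$\Delta V_{error} = \frac{V_{off}}{1 + A} + \frac{\alpha\, C_{ox} W_0 L_0\, (|V_{gs0}| + V_{th,p0})}{C_{pd}} + \frac{C_{ov0}}{C_{pd} + C_{ov0}}\, V_h \qquad (5.6)$$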


In general, a large opamp gain A is good for reducing the offset residue but degrades the comparator speed. Considering the worst-case threshold variations, the opamp open loop gain is required to be at least 100 (see Appendix A). As the opamp unity-gain bandwidth $f_0$ decides the autozeroing settling time, a settling time of $t_{set}$ = 5 µs demands $f_0 > 1/(2\pi t_{set}/7) \approx$ 224 kHz to achieve 0.1% reset accuracy.
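The bandwidth requirement follows from first-order settling: reaching 0.1% accuracy takes about $\ln(1000) \approx 6.9$ time constants, which the text rounds up to 7. The sketch below reproduces this arithmetic and the finite-gain offset residue; it assumes a single-pole closed-loop response, and the function names are illustrative.

```python
import math

def time_constants_for(accuracy):
    """Time constants needed for a first-order step to settle within `accuracy`."""
    return math.log(1.0 / accuracy)

def min_unity_gain_bandwidth(t_set, n_tau=7.0):
    """Bandwidth (Hz) whose time constant 1/(2*pi*f0) fits n_tau times into t_set."""
    return n_tau / (2.0 * math.pi * t_set)

def residual_offset(v_off, gain):
    """Offset left after autozeroing with a finite opamp gain."""
    return v_off / (1.0 + gain)

n_needed = time_constants_for(1e-3)      # about 6.9 time constants for 0.1%
f0 = min_unity_gain_bandwidth(5e-6)      # about 223 kHz for t_set = 5 us
```

The computed bandwidth is close to the 224 kHz quoted above. As an example of the gain requirement, `residual_offset(10e-3, 100)` shows that a gain of 100 would reduce a hypothetical 10 mV offset to roughly 0.1 mV.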

Reset speed imposes no specific requirement on the M0 design. Charge injection error and off-state leakage current, however, have great impacts on it. When M0 is biased in the weak inversion region, the drain current is given by [39]

$$I_{ds} = \mu_{eff} C_{ox} \frac{W}{L}\,(m - 1)\,(U_T)^2 \exp\left(\frac{-V_{gs} + V_{th}}{m U_T}\right) \left(1 - e^{V_{ds}/U_T}\right) \qquad (5.7)$$

where $\mu_{eff}$ is the mobility of holes, $U_T = kT/q$ is the thermal voltage, and m is the body effect coefficient. It shows that the subthreshold current depends exponentially on the threshold voltage and the gate-source voltage. So to achieve a low leakage current, a thick oxide transistor with a high threshold voltage is needed. In addition, the positive gate-source voltage during the comparison phase also helps decrease the leakage current. The simulated leakage current is about 0.083 fA at T = 27°C. It should be noted that the real leakage current may be larger than this number, because a high chip operating temperature will counteract this reduction, as indicated by the exponential part of equation 5.7. A high threshold voltage is also useful for decreasing the channel charge for the same $V_{gs}$, therefore reducing the charge injection.

In addition to the signal, noise is also stored on the capacitor. Autozeroing is equivalent to a high-pass filtering process, thus the low frequency noise, i.e., 1/f noise, is strongly reduced but at the cost of an increased noise floor due to aliasing of the broadband white noise into the baseband [40].

5.1.4  Comparator  Design 

A typical comparator utilized in A/D converters consists of a preamplifier and a latch and has two modes of operation: tracking and latching [41]. It must, however, work synchronously, which is not applicable in our design. In a TTFS_classic imager, there is no single global capture time, so we cannot decide when to latch the comparison result. Since this comparator also works as an opamp during the reset phase, we need to consider the design requirements for both cases. The following are the design criteria for this opamp/comparator:

1.  Gain 

As  explained  in  Appendix  A,  to  ensure  the  success  of  autozeroing,  the  opamp 
gain  should  be  greater  than  100.  During  the  comparison  phase,  the  comparator 
essentially  implements  a one-bit  comparison,  so  a high  DC  gain  is  not  required 
in  this  phase.  In  summary,  the  opamp/comparator  DC  gain  is  required  to  be 
at  least  100. 

2.  Open-loop  Bandwidth 

In Chapter 4, we have shown that the comparison delay depends on the slew rate
of the input signal, and the maximum delay is determined by the comparator
open-loop bandwidth. In order to achieve good linearity in the reconstruction,
we expect the comparison delay to be no larger than 2 µs, which corresponds to
f_{-3dB} = 80kHz.

3. Closed-loop Bandwidth

As discussed earlier, the reset phase length requires the closed-loop bandwidth
to be no less than 224kHz, which can easily be met.

4.  Power  and  Size 

The  comparator  needs  to  be  biased  in  the  subthreshold  region  to  minimize 
power  consumption.  Pixel  size  is  one  of  our  major  concerns,  since  small  pixel 
area  means  good  spatial  resolution.  Thus  the  number  of  transistors  must  be 
minimized. 
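
As a quick numerical check of criteria 1 and 2, the sketch below converts the gain
requirement to decibels and the 2 µs delay bound to a bandwidth. It assumes a
single-pole response whose time constant equals the maximum comparison delay; that
mapping and the function names are illustrative assumptions, not part of the design
flow described here.

```python
import math


def gain_to_db(gain: float) -> float:
    """Voltage gain expressed in decibels."""
    return 20.0 * math.log10(gain)


def f3db_for_delay(tau_s: float) -> float:
    """-3 dB frequency (Hz) of a single-pole stage with time constant tau_s.

    Assumption: the worst-case comparison delay is taken equal to the
    time constant of the dominant pole.
    """
    return 1.0 / (2.0 * math.pi * tau_s)


min_gain_db = gain_to_db(100.0)      # criterion 1: DC gain of at least 100
min_f3db_hz = f3db_for_delay(2e-6)   # criterion 2: delay no larger than 2 us

print(f"{min_gain_db:.1f} dB")            # 40.0 dB
print(f"{min_f3db_hz / 1e3:.1f} kHz")     # 79.6 kHz
```

Under this assumption the 2 µs bound maps to roughly 80kHz, and a gain of 100 is the
40dB figure quoted later for the simulated opamps.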

In our previous design, a cascaded opamp was used, which has a very high gain
at the expense of speed. To boost speed, the bias current must be increased, which
is obviously contrary to the low power consumption requirement. So in our current
design, two types of opamp, shown in figure 5-7 and figure 5-9 respectively, are under
consideration, mainly because of speed and pixel size concerns. We compare
these two topologies in the following.

• 6-transistor  Clamped  Opamp 




Figure  5-7:  6-transistor  clamped  opamp/comparator 


1.  Gain 

As  required,  all  transistors  are  biased  in  the  weak  inversion  region  (or 
subthreshold  region).  The  PMOS  drain  current  is  given  in  equation  5.7. 
When Vds is much larger than a few U_T, the current can be generally
simplified to

I_{ds} = I_0 \exp\left( \frac{-V_{gs} + V_{th}}{m U_T} \right)  (5.8)

where

I_0 = \mu_{eff} C_{ox} \frac{W}{L} (m - 1) U_T^2  (5.9)

It yields the transconductance

g_m = \left| \frac{\partial I_{ds}}{\partial V_{gs}} \right| = \frac{I_{ds}}{m U_T}  (5.10)

Given I_{ds,P1} = I_{ds,P2} = I_{ds,P3} = 0.5 I_{ds,P0}, all the transistors in figure 5-7
except P0 have the same transconductance g_m. Then the opamp gain is

A = \frac{1}{2} \frac{g_m}{g_{ds,N2} + g_{ds,P3}} = \frac{1}{2} \frac{1/(m U_T)}{1/V_{A,N2} + 1/V_{A,P3}}  (5.11)

where V_{A,N2} and V_{A,P3} are the Early voltages of N2 and P3 respectively. The
factor 0.5 comes from the fact that only half of the small signal current
flows through transistor P2 to the output node. From CADENCE simulation
with I_{ds,P0} = 268nA and I_{ds,P3} = 144nA, a 40dB DC gain is achieved.

2.  Speed 

Generally, the first pole is determined by the dominant time constant at
the output node, which is

\tau = \frac{C_{out}}{g_{ds,N2} + g_{ds,P3}}  (5.12)

where C_out is the total capacitance at the output node including the load
capacitance and the parasitic capacitance of N2 and P3. Then the -3dB
frequency is approximately

f_{-3dB} = \frac{1}{2\pi\tau} = \frac{1}{2\pi} \frac{g_{ds,N2} + g_{ds,P3}}{C_{out}}  (5.13)

With the same bias condition, the simulated f_{-3dB} is 170kHz.

3.  Noise 

Since this clamped opamp has an asymmetric topology, the well-known
opamp noise model is not applicable. To see the noise performance, we
need to analyze each individual component shown in figure 5-8. In the
following derivations, we have assumed matching between the transistor pair
P1 and P2 as well as the pair N1 and N2. It is very straightforward to
analyze the noise contributions from N1, N2 and P3. For instance, the
noise contribution from N1 to the output node is

V_{no}^2 = (g_{m,N1} R_o)^2 V_{n,N1}^2  (5.14)

where V_{no} is the output noise, R_o = \frac{1}{g_{ds,N2} + g_{ds,P3}}, and V_{n,N1} is the
equivalent noise source of transistor N1.



Figure  5-8:  Diagram  of  6-transistor  opamp  noise  sources 

For P1 and P2, notice that only half of the equivalent noise current of P1
is collected by the output node because of the asymmetric topology. Thus,
we have

V_{no}^2 = \left( \frac{1}{2} g_{m,P1} R_o \right)^2 V_{n,P1}^2  (5.15)

Finally, the noise contribution from P0 to the output cannot be ignored
as in symmetric opamps. Its noise current is divided into two components:
one flows through P1 to ground, and the other flows through P2 to
the output node. As a result, the noise contribution from P0 is given by

V_{no}^2 = \left( \frac{1}{2} g_{m,P0} R_o \right)^2 V_{n,P0}^2  (5.16)

Hence, the total output noise power will be

V_{no}^2 = \frac{1}{2} (g_{m,P1} R_o)^2 V_{n,P1}^2 + \frac{1}{4} (g_{m,P0} R_o)^2 V_{n,P0}^2 + 2 (g_{m,N1} R_o)^2 V_{n,N1}^2 + (g_{m,P3} R_o)^2 V_{n,P3}^2  (5.17)

This output noise value can be referred back to get an equivalent input
noise by dividing it by the squared gain \left( \frac{1}{2} g_{m,P1} R_o \right)^2, which results in

V_{neq}^2 = 2 V_{n,P1}^2 + \left( \frac{g_{m,P0}}{g_{m,P1}} \right)^2 V_{n,P0}^2 + 8 \left( \frac{g_{m,N1}}{g_{m,P1}} \right)^2 V_{n,N1}^2 + 4 \left( \frac{g_{m,P3}}{g_{m,P1}} \right)^2 V_{n,P3}^2  (5.18)

If we size our devices such that g_{m,P3}, g_{m,N1}, g_{m,P0} << g_{m,P1}, we can
minimize the noise contributions from devices N1, N2, P3 and P0. This
can be accomplished by pushing those transistors into strong inversion
by making (W/L)_{P3,N1,P0} << (W/L)_{P1}. In practice, we cannot increase
their channel lengths arbitrarily due to the limited pixel size. Thus, these
noise sources still dominate. Unlike that in strong inversion, the equivalent
thermal noise source in weak inversion is approximately V_n^2 = 2kT/g_m [42],
which yields

V_{neq}^2 = \frac{4kT}{g_{m,P1}} \left( 1 + \frac{g_{m,P0}}{2 g_{m,P1}} + \frac{4 g_{m,N1}}{g_{m,P1}} + \frac{2 g_{m,P3}}{g_{m,P1}} \right)  (5.19)



• 5-transistor Opamp

Figure 5-9 shows the other comparator under consideration, which consists
of a simple 5-transistor opamp shown inside the dashed box and an inverter
working as a second stage. The pseudo positive feedback is used to make the
output node immune to the power supply noise.


Figure  5-9:  5-transistor  opamp/comparator 


1.  Gain  and  Speed 

Similar to the previous discussion, the 5-transistor opamp gain is

A = \frac{g_{m,P1}}{g_{ds,N2} + g_{ds,P2}} = \frac{1/(m U_T)}{1/V_{A,N2} + 1/V_{A,P2}}  (5.20)

Since this 5-transistor opamp is a symmetric topology, there is no factor
of 0.5 in the gain expression. With a bias current of 268nA, the simulated
gain is 44dB, and the f_{-3dB} is 222kHz.

2. Noise

The noise analysis for the 5-transistor opamp is quite simple. Assuming
all the transistors work in the weak inversion region, the input referred
noise is

(5.21)

Note that the noise contribution from P0 is ignored because of the symmetric
topology. Obviously, this input referred noise is much smaller than
that of the 6-transistor opamp.
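
To put a rough number on this comparison, the sketch below evaluates equation 5.19
for the 6-transistor circuit against the standard textbook expression for a matched
5-transistor OTA under the same weak-inversion model V_n^2 = 2kT/g_m. The textbook
form used for the 5-transistor case is an assumption and is not necessarily the exact
form of equation 5.21. The slope factor m = 1.3, the 300 K temperature, the equal
branch currents and all function names are also illustrative assumptions.

```python
import math

K_B = 1.380649e-23       # Boltzmann constant, J/K
Q_E = 1.602176634e-19    # elementary charge, C


def weak_inversion_gm(i_ds: float, m: float = 1.3, temp_k: float = 300.0) -> float:
    """Equation 5.10: g_m = I_ds / (m * U_T). The value m = 1.3 is assumed."""
    u_t = K_B * temp_k / Q_E
    return i_ds / (m * u_t)


def noise_6t(gm_p1, gm_p0, gm_n1, gm_p3, temp_k=300.0):
    """Equation 5.19: input-referred noise density (V^2/Hz) of the 6T opamp."""
    return 4.0 * K_B * temp_k / gm_p1 * (
        1.0 + gm_p0 / (2.0 * gm_p1) + 4.0 * gm_n1 / gm_p1 + 2.0 * gm_p3 / gm_p1
    )


def noise_5t(gm_p1, gm_n1, temp_k=300.0):
    """Assumed textbook result for a matched 5T OTA with V_n^2 = 2kT/g_m:
    2*V_n,P1^2 + 2*(gm_N1/gm_P1)^2 * V_n,N1^2."""
    return 4.0 * K_B * temp_k / gm_p1 * (1.0 + gm_n1 / gm_p1)


i_tail = 268e-9                            # tail current through P0
gm_branch = weak_inversion_gm(i_tail / 2)  # every branch device carries half
gm_tail = weak_inversion_gm(i_tail)

ratio = noise_6t(gm_branch, gm_tail, gm_branch, gm_branch) / noise_5t(gm_branch, gm_branch)
print(f"6T / 5T noise power ratio: {ratio:.2f} ({10 * math.log10(ratio):.1f} dB)")
```

With every branch device in weak inversion at the same current, the ratio is
independent of m and temperature and comes out at a factor of 4 in noise power
(about 6dB), which agrees with the qualitative statement above.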


3. Residue

One potential problem with the 5-transistor opamp is the capacitive coupling
through the overlap capacitor C_{ov,out} (see figure 5-10). After reset,
the opamp output voltage V_out suddenly drops from V_reset to almost
0V. This large swing \Delta V_{out} will introduce some error voltage expressed
as

(5.22)

The formula shows that a large C_m1 can help reduce this error voltage.
C_m1, however, cannot be made too large, since a very large C_m1 will
decrease the fill factor and also degrade the conversion gain.

Figure 5-10: Diagram of the additional capacitor coupling in 5-transistor opamp

Comparing their performances, we prefer the 5-transistor opamp, because
the high input-referred noise floor makes the 6-transistor circuit unfavorable. The
potential capacitor coupling problem with the 5-transistor opamp is not severe, since
it can be reduced by increasing the MOS capacitor C_m1. So far, the overall offset
residue after reset in terms of charge is given by

(5.23)

In our design, we used PMOS transistors as the input pair to save pixel
layout area. Theoretically, both PMOS and NMOS input pairs are able to provide
the same signal swing if the autozeroing reset technique is applied. However, since the
lower bound of V_ref is limited to about 1V to achieve a constant capacitance for the
MOS capacitor, an NMOS pair can be adopted to further increase the signal swing
from 1.4V to 2V at the cost of a larger pixel layout area.

5.1.5 Digital Control Block Design

The digital control block inside each pixel consists of some simplified digital logic
gates. No special design is required except for one important issue: how to design the
pixel interface to a column or row. To simplify the design of the OR gates with a large
number of inputs for each row and column, pseudo-CMOS logic is used instead. To
obtain a fast switching speed, we need to optimally size the transistors M6, M7 and
M9 in figure 5-1. Since M6 always switches on ahead of M7, the pull-down delay is
almost the same as that of a single NMOS. Typically, the always-on PMOS transistor
has only limited driving current compared to the NMOS transistors. The pull-down
delay \tau_{down} can be expressed as [39]

\tau_{down} = \frac{C V_{dd}}{2 W_n I_{nsat}}  (5.24)

where I_{nsat} is the saturation current per unit width of NMOS M7, W_n is the width
of M7, and C is the total output capacitance. Similarly, the PMOS pull-up delay is

\tau_{up} = \frac{C V_{dd}}{2 W_p I_{psat}}  (5.25)


where I_{psat} is the saturation current per unit width of PMOS M8, and W_p is the
width of M8. In this design, C mainly comes from the parasitic capacitance of
NMOS M6, M7 and M9. Then,

C = k W_n C_0  (5.26)

where k is the number of pixels in a row or column, and C_0 is the capacitance per
unit width. Thus, we can simplify the switch delays as

\tau_{down} = \frac{k C_0 V_{dd}}{2 I_{nsat}}  (5.27)

\tau_{up} = \frac{k W_n C_0 V_{dd}}{2 W_p I_{psat}}  (5.28)

Note that W_n has no effect on the pull-down delay provided that W_n is large enough
to ensure the correct logic. In contrast, a large W_n lengthens the pull-up delay
because of the increased load capacitance. CADENCE simulation results in figure 5-11
have verified these observations. In addition, a small transistor width also helps to
save layout area.
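
The dependence on W_n in equations 5.27 and 5.28 can be checked with a few lines of
code. The sketch below is only illustrative: the per-unit-width capacitance, the
saturation currents and W_p are hypothetical placeholder values, not extracted
process parameters. Only k = 128 and the 1.8V digital supply are taken from this
design.

```python
def pull_down_delay(k, c0, vdd, i_nsat):
    """Equation 5.27. W_n cancels out, so it is not an argument."""
    return k * c0 * vdd / (2.0 * i_nsat)


def pull_up_delay(k, w_n, c0, vdd, w_p, i_psat):
    """Equation 5.28. The load grows with W_n, the drive with W_p."""
    return k * w_n * c0 * vdd / (2.0 * w_p * i_psat)


K_PIXELS = 128     # pixels per row or column in the prototype array
VDD = 1.8          # digital supply voltage, V
C0 = 1e-15         # hypothetical capacitance per unit width, F/um
I_NSAT = 600e-6    # hypothetical NMOS saturation current per unit width, A/um
I_PSAT = 250e-6    # hypothetical PMOS saturation current per unit width, A/um
W_P = 1.0          # hypothetical PMOS width, um

t_down = pull_down_delay(K_PIXELS, C0, VDD, I_NSAT)
for w_n in (0.5, 1.0, 2.0, 4.0):
    t_up = pull_up_delay(K_PIXELS, w_n, C0, VDD, W_P, I_PSAT)
    print(f"W_n = {w_n:3.1f} um: tau_down = {t_down * 1e9:.3f} ns, "
          f"tau_up = {t_up * 1e9:.3f} ns")
```

The pull-down delay stays constant across the sweep while the pull-up delay scales
linearly with W_n, which is the trend the text attributes to figure 5-11.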

Figure 5-11: Pull-down and pull-up delay for different values of W_n

5.2 One Pixel Test

A test chip with one pixel was fabricated in TSMC 0.18 µm digital CMOS technology.
The chip has demonstrated the predicted functionality. In the following,
we show some experimental results.

5.2.1 Light-Intensity-to-Time Transform

In a TTFS imager, the firing time is inversely proportional to the incident light
intensity. To measure this light-intensity-to-time transform, we use three different
light sources. With different neutral density filters covering the test chip, a wide
range of luminance is provided. The light intensities in the testing environment are
obtained using an LX-102 digital light meter. The measured transform is shown
in figure 5-12, which demonstrates good linearity in the log domain, as expected.
Limited by the available optics, the time response to very weak light intensities is
not included in this measurement. We believe the firing time will saturate at 4.5s
when the dark current becomes dominant. By putting the light source very close
to the chip, we obtain a shortest firing time of about 16 µs, which gives a measured
dynamic range of 109dB. Since it is impossible for us to measure the actual light
intensity under this circumstance, we did not include this data point in figure 5-12,
which demonstrates only about 90dB of dynamic range.
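
The 109dB figure follows directly from the ratio of the longest to the shortest
firing time. The short sketch below reproduces it, assuming that the photocurrent is
inversely proportional to the firing time, so that the intensity ratio equals the
time ratio. The 4.5s value is the expected dark-current-limited saturation time
quoted above rather than a measured point, and the function name is illustrative.

```python
import math


def dynamic_range_db(t_longest_s: float, t_shortest_s: float) -> float:
    """Dynamic range in dB from the ratio of longest to shortest firing time."""
    return 20.0 * math.log10(t_longest_s / t_shortest_s)


dr = dynamic_range_db(4.5, 16e-6)
print(f"{dr:.1f} dB")  # 109.0 dB
```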

Figure 5-12: Light-intensity-to-time transform

5.2.2 Signal-Swing-to-Time Response

As we know, the firing time is a function not only of the incident light intensity
but also of the signal swing. A reduced signal swing leads to a decreased
firing time for the same light intensity. The measured signal-swing-to-time response
in figure 5-13 shows good linearity in the large signal swing region. In contrast, the
reset residue degrades the response linearity in the small signal swing region. After
calibration, the linearity is restored. The measured offset is around 50mV,
which can be further reduced by increasing the capacitor C_m1.

Figure 5-13: Signal-swing-to-time response

5.2.3 Noise

The conventional CMOS imager noise measurement is performed in the analog
domain, which is not suitable for time-based image sensors. Therefore, we need to
express the noise in terms of firing time variance. Suppose we have a measured signal
i, which is expressed as

i = I + \Delta I  (5.29)

where I is the original clean signal or expected value of i, and \Delta I is the noise
component. Then, the actual firing time for the measured signal i is

(5.30)

(5.31)

Following that, it yields

(5.32)

and

(5.33)

where \bar{t} and \sigma_t^2 are the mean and variance of the firing times over a sequence
of frames respectively. So we can measure firing times to estimate analog noise. The
major limitation on the accuracy of the measurement is the stability of the light
source. Limited by the currently available optical instruments, we use a fluorescent light
as the light source. We first set V_reset = 2V and V_ref = 1V. The measured firing
times of 300 frames are shown in figure 5-14. The mean firing time is measured to be
11.04ms, and the standard deviation is 52.7 µs, which gives a 46.4dB SNR. Another
measurement is made by adjusting V_ref to 1.4V. Since the well capacity is
decreased, we expect the SNR to be somewhat degraded, as discussed in Chapter
3. We used the same light source and measured 300 frames again. This time, the
mean firing time changes to 7.093ms, which is about 0.6 of the previous measurement,
as expected. The standard deviation is 38.7 µs, which corresponds to a degraded SNR
of 45.2dB.
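
The SNR values quoted here are consistent with taking the ratio of the mean firing
time to its standard deviation, i.e., with the relative spread of the firing time
matching the relative spread of the signal to first order. The sketch below applies
that relation to the two reported measurements. The function names are illustrative,
and `firing_time_snr_db` assumes the per-frame firing times are available as a plain
list of seconds.

```python
import math
import statistics


def snr_db_from_stats(mean_t: float, sigma_t: float) -> float:
    """SNR in dB from the mean and standard deviation of the firing time."""
    return 20.0 * math.log10(mean_t / sigma_t)


def firing_time_snr_db(firing_times) -> float:
    """Same estimate computed directly from a sequence of per-frame firing times."""
    mean_t = statistics.fmean(firing_times)
    sigma_t = statistics.stdev(firing_times)
    return snr_db_from_stats(mean_t, sigma_t)


snr_ref_1v0 = snr_db_from_stats(11.04e-3, 52.7e-6)   # V_ref = 1V
snr_ref_1v4 = snr_db_from_stats(7.093e-3, 38.7e-6)   # V_ref = 1.4V
print(f"{snr_ref_1v0:.2f} dB")  # 46.42 dB
print(f"{snr_ref_1v4:.2f} dB")  # 45.26 dB
```

The first value matches the reported 46.4dB. The second evaluates to about 45.26dB,
which the text quotes as 45.2dB.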


Figure 5-14: Temporal noise when V_ref = 1V


5.2.4  Conversion  Gain 

Conversion gain characterizes the signal generated per photoelectron and indicates
the sensitivity of the sensor. An accurate determination of the conversion gain
is also helpful in determining a photodiode’s quantum efficiency [43]. The conversion
gain is defined by

g = \frac{v}{n} = \frac{q}{C_{pd}}  (5.34)

where v is the signal voltage at the photodiode and n is the number of photoelectrons.
B. Beecken and E. Fossum have proposed an excellent statistical method to determine
the conversion gain based on an analog signal measurement [43]. For time-based image
sensors, the conversion gain can be determined from the measured firing times by
assuming that the shot noise is the dominant noise source and obeys Poisson statistics
[4]. Once the firing times are measured, the conversion gain can be determined by

g = \frac{q}{C_{pd}} = \frac{V_{reset} - V_{ref}}{N}  (5.35)

where N is the number of integrated electrons, which can be estimated by
N = (\bar{t} / \sigma_t)^2.
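
Equation 5.35 can be evaluated directly from firing-time statistics. The sketch
below uses the mean and standard deviation reported in section 5.2.3 for V_reset = 2V
and V_ref = 1V. The function names are illustrative, and the estimate relies on the
same shot-noise-limited (Poisson) assumption stated above.

```python
Q_E = 1.602176634e-19  # elementary charge, C


def integrated_electrons(mean_t: float, sigma_t: float) -> float:
    """Poisson estimate N = (mean / sigma)^2 from firing-time statistics."""
    return (mean_t / sigma_t) ** 2


def conversion_gain(v_reset: float, v_ref: float, n_electrons: float) -> float:
    """Equation 5.35: signal swing divided by the number of integrated electrons."""
    return (v_reset - v_ref) / n_electrons


n = integrated_electrons(11.04e-3, 52.7e-6)
g = conversion_gain(2.0, 1.0, n)
c_pd = Q_E / g

print(f"N    = {n:.0f} electrons")
print(f"g    = {g * 1e6:.1f} uV/electron")   # 22.8 uV/electron
print(f"C_pd = {c_pd * 1e15:.2f} fF")        # 7.03 fF
```

Rounding g to 23 µV/electron before dividing gives the slightly smaller capacitance
of 6.96fF.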

We captured the firing times using an Agilent 1693A Logic Analyzer. The sampling
period is set to 5ns, which ensures that the quantization noise is much lower
than the other noise sources. We choose V_ref = 1V to guarantee that the shot noise is the
dominant source, since the photocurrent shot noise is proportional to the signal swing
in time-based image sensors. From our previously measured results, we estimate the
conversion gain to be

g = \frac{1\ \mathrm{V}}{(11.04 / 0.0527)^2} \approx 23\ \mu\mathrm{V/electron}  (5.36)

It also gives

C_{pd} = \frac{q}{g} = 6.96\ \mathrm{fF}  (5.37)

In general, a small well capacity or capacitance can be traded off for a large conversion
gain.

5.3 Asynchronous Readout Design

A TTFS_classic imager is essentially a mixed-signal system with analog circuitry
inside each pixel and digital readout circuitry surrounding the pixel array and
handling the asynchronous readout scheme. Since the digital circuitry operates
asynchronously, the standard Verilog or VHDL languages are not applicable to its
design. Instead, our asynchronous readout circuitry is simulated with CADENCE SpectreS
while taking the parasitics into account. The digital asynchronous readout circuitry mainly
consists of a row arbiter tree, a column arbiter tree, a column latch, a column latch
control, a row interface and a throughput control block. We will detail each block in
the following sections.

5.3.1  Arbiter 

The arbiter cell shown in figure 5-15 is the central part of an asynchronous
readout circuit and deals with request collisions. An arbiter tree is built from
two-input arbiter cells using a binary tree architecture. The arbiter used in the TTFS
imager is not significantly different from that of Boahen [26].

Figure 5-15: Schematic of arbiter cell

The truth table of the arbiter cell is given in table 5-1. It should be pointed out
here that the logic combination of req_out_init = 0 and ~sel_in = 0 does not exist,
because without any request to the next stage, i.e., req_out_init = 0, it is impossible
for the next stage to generate a response signal ~sel_in = 0. Whenever the current
arbiter request to the next stage is approved, i.e., req_out_init = 1 and ~sel_in = 0,
an output signal is generated depending on the strength of the corresponding
input signal. If a collision occurs, the two inputs will compete to ensure that only one
output, either ~sel_out_1 or ~sel_out_2, is activated.

Table 5-1: Truth table of arbiter cell
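
The behaviour of a binary arbiter tree can be illustrated with a small software
model. The sketch below is a purely behavioural abstraction and not a description
of the circuit in figure 5-15: it ignores timing, the request/select handshake
signals and metastability, and it resolves a collision by always favouring input 1,
which is an arbitrary assumption (in the circuit the winner depends on the relative
strength of the inputs). All function names are hypothetical.

```python
from typing import List, Optional


def arbiter_cell(req_1: bool, req_2: bool) -> Optional[int]:
    """Two-input arbiter cell: return 1 or 2 for the granted input, None if idle.

    On a collision exactly one input is granted. Favouring input 1 is an
    assumption made only to keep the model deterministic.
    """
    if req_1:
        return 1
    if req_2:
        return 2
    return None


def arbiter_tree(requests: List[bool]) -> Optional[int]:
    """Grant at most one of the pending requests using a binary tree of cells."""
    n = len(requests)
    if n == 0 or n & (n - 1):
        raise ValueError("number of inputs must be a power of two")
    if n == 1:
        return 0 if requests[0] else None
    half = n // 2
    # Each half forwards a request upward if any of its inputs is active.
    winner = arbiter_cell(any(requests[:half]), any(requests[half:]))
    if winner is None:
        return None
    if winner == 1:
        return arbiter_tree(requests[:half])
    return half + arbiter_tree(requests[half:])


def drain(requests: List[bool]) -> List[int]:
    """Serve all pending requests one at a time and return the grant order."""
    pending = list(requests)
    order = []
    while True:
        granted = arbiter_tree(pending)
        if granted is None:
            return order
        order.append(granted)
        pending[granted] = False


print(drain([False, False, True, False, False, True, False, False]))  # [2, 5]
```

Even when two requests are pending at the same time, the tree serializes them so
that only one address is granted per arbitration.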

5.3.2 Latch Cell and Latch Control

To boost the throughput of the asynchronous readout, latch and latch control
circuits are incorporated. Their schematics are shown in figure 5-16 [26]. Note
that there is one latch cell for each column and one single latch control circuit for the
whole array. The latch cells and the latch control block need to cooperate
properly to generate the correct logic. They operate as follows:

1. The initial condition is obtained in the reset period by activating the control
signal ~rst and deactivating ~col_request(n), i.e., ~rst=0 and ~col_request(n)=1.
Then we have col_req_arbiter(n)=0, col_sel(n)=0 (since col_req_arbiter(n)=0, no
response comes from the column arbiter), lp = 0, ~g = 1, ~row_sel_enable=0,
s=1, and latch_data_ready=0.

2. Once a column request comes in, i.e., ~col_request(n) = 0, it pulls up lp and
sends out a request signal to the column arbiter by setting col_req_arbiter(n) =
1, which in turn pulls down ~g. Then ~g = 0 and lp = 1 lead to s = 0 and
latch all the column requests by cutting off the input path. Simultaneously,
~row_sel_enable = 1 disables the pixel column request function and guarantees
that no new column request is generated. At the same time, the active signals
latch_data_ready and row_select(m) produce a control signal disable(m), which
together with col_req(n) disables the latched pixels in the selected row (see
figure 5-1). In addition, row_address_trigger = 1 makes the row address encoder
update the output row address.

3. When all the column request signals are processed and withdrawn after receiving
the feedback signal col_sel(n) = 0, ~g returns to high. Since no new
~col_request(n) arrives, lp = 0 brings all the control signals back to their initial
conditions, and all the latch cells are open again for the upcoming
~col_request(n)s.


Figure 5-16: Schematics of latch cell and latch control. (a) Latch cell. (b) Latch
control.

5.3.3 Row Interface

The row interface shown in figure 5-17 has two functions: blocking unwanted
row_request(m) signals and deciding whether a row can be selected by the row arbiter.
~isolation is the same signal that we used in the pixel design. During the reset
period, the uplink path to the row arbiter is disabled, and row_req_arbiter is initially
set to 0. Similarly, the downlink path from the row arbiter is disabled by setting
~row_sel_enable = 1 once there is an occupied latch cell.

Figure 5-17: Schematic of row interface


5.3.4  Throughput  Control 

Usually, a communication channel between neuromorphic chips using address-event
representation needs a request line and an acknowledge line. A typical arbitered
communication channel adopts a handshake protocol, i.e., a new transmission cannot
start until the acknowledge signal for the last transmission is received. A handshake
method inevitably increases the time overhead of one arbitration by adding
an acknowledge time period to the communication cycle. Though many methods
have been proposed to shorten one complete cycle [44], it is still too long for our
TTFS_classic imager when the reconstruction errors are considered, which will be discussed
in the next chapter. Since randomly lost pixels can be reconstructed by a spatial
interpolation method, we can relax this tight requirement on the communication channel
design by simply deleting the acknowledge line. Instead, we designed a readout control
block shown in figure 5-18, which is controlled by an external clock to make testing
easier. Two non-overlapping clocks, phase1 and phase2, control the transmission gates,
so at most one transmission gate is open at any time. The signal col_sel_arbiter(n)
comes from the column arbiter, the signal col_sel(n) goes to the latch cell, and the
signal col_encoder_in(n) is the input to the column address encoder.




Figure  5-18:  Schematic  of  throughput  control 


5.4  Layout 

A prototype TTFS_classic imager was implemented in TSMC 0.18 µm digital
technology and packaged in a standard 84-pin PGA available through MOSIS. The
total die size is 5mm x 5mm. The layout is shown in figure 5-19, which has a 128x128
pixel array sitting in the center of the chip, surrounded by guard rings. To
reduce the substrate coupling, the guard rings are tied to a separate pair of power
supply and ground. In addition, the analog, digital and pad circuits use different power
supplies to avoid corrupting the sensitive analog signals, and each supply is fed
by multiple input pins to reduce the equivalent parasitic inductance. Also, to prevent
light-induced currents from affecting the analog circuitry or causing latchup, the sixth
metal layer is used everywhere except over the photosensitive node.

Figure 5-19: Layout of the prototype TTFS_classic

Numerous on-chip bypass capacitors are also included to reduce SSN. One potential
problem with on-chip bypass capacitors is the resonance oscillation in the
power distribution network [45]. In our design, we estimate the total power supply
parasitic inductance L_p to be less than 2nH, resulting from multiple power supply
pins and mutual inductance. The maximum on-chip bypass capacitance C_by
is simulated to be about 3nF. This gives a minimum resonance frequency of about

\omega_r = \frac{1}{\sqrt{2 L_p C_{by}}} \approx 289MHz

The clock we used in the test is below 50MHz, which is
much less than the minimum resonance frequency. Thus, the unexpected resonance
oscillation in the power supply network did not occur in either our CADENCE
simulations or the real-time image capture test. If a much faster clock is used in the test,
an on-chip parasitic resistance needs to be included in series with the on-chip bypass
capacitor [45]. In addition, we optimize the digital output buffer to limit its driving
current and so reduce the SSN amplitude, while ensuring that it can drive a standard
15pF load within 2ns. To further reduce the SSN, off-chip resistors are added in series
with each address output pin.

5.5  Testing  and  Characterization  of  128x128  Sensor  Array 

A small printed circuit board (PCB), shown in figure 5-20, is built to integrate the
TTFS_classic imager, voltage regulators, bias generators and connectors. A constant
reference voltage is generated on the PCB. Once the timing information is
collected by an Agilent 1693A logic analyzer working in the asynchronous sampling
mode, a simple MATLAB program reconstructs the captured images.


Figure 5-20: Experimental setup

Figure 5-21: A 98dB dynamic range image. (a) scale=1. (b) scale=16. (c) scale=64.
(d) scale=128.


The high dynamic range test is first set up with a one dollar bill taped over an
incandescent lamp and projected onto our imager. We captured several high dynamic
range images by adjusting the height of the object. A 98dB dynamic range image is
shown in figure 5-21. If we map the brightest pixel to the maximum discrete display
level, only the center of the incandescent bulb appears in the upper left corner of
figure 5-21(a), and the shape of the lamp appears in the lower region of the image.
With the same captured image scaled by different levels, more details of the dark region
show up. In figure 5-21(b), an obscure image of the dollar bill is present, whereas
the details of the dollar bill are clearly displayed in figure 5-21(c). Note that part
of the filament also shows up in the upper left corner of this figure. As illustrated
in figure 5-21(d), if the scaling factor is too big, most of the pixels saturate and
are barely visible except for the portrait of President George Washington. Another
high dynamic range image of 75dB is displayed in figure 5-22, which contains more
detailed information on both the lamp holder and the filament. These two images have
a dynamic range difference of about 20dB, which is due to the fact that the darkest
pixel in the first capture is much weaker than that in the last one. The experimental
results are well beyond the dynamic range limit of current commercial CMOS APSs.

Figure 5-22: A 75dB dynamic range image. (a) bright region details. (b) dark region
details.

Figure 5-23: Gator logo 1. (a) bright region details. (b) dark region details.

A more complicated experimental setup is necessary to obtain images with a much
higher dynamic range, which is not feasible with our currently available test
equipment. Here, we also give two other captured images of two different gator
logos in figure 5-23 and figure 5-24 respectively. Again, after scaling the image,
more details in the dark region show up in figure 5-23(b), e.g., part of the words
“UNIVERSITY OF FLORIDA” at the bottom and half of the gator head at the left.
Note that each of the images displayed here was collected in a single capture without
postprocessing. In each capture, though some missed firing pixels are present, they
can be reconstructed by a spatial interpolation method.

Figure 5-24: Gator logo 2

The maximum power consumption of the whole system is measured to be less
than 8.7mW when the sensor operates at 30 frames/sec. For the fixed pattern noise
(FPN) measurement, a strictly uniform light source is required. Unfortunately, we do
not have such an optical instrument, so we used a typical ceiling fluorescent lamp as
the uniform light source and measured the firing times without a lens. In this
way, the FPN is estimated to be about 2.3%.

5.6 Summary

In this chapter, we have discussed the detailed circuit design of the TTFS_classic
image sensor. The characterization is done at both the pixel and array levels in terms of
power consumption, light-intensity-to-time transform, signal-swing-to-time response,
noise, conversion gain and dynamic range. The demonstrated test results are very
promising. Due to the lack of the necessary optical equipment, the FPN is only
estimated in a typical room light environment. The performance of the TTFS_classic
imager is summarized in table 5-2. We did not observe any reconstruction error due
to readout collisions in the images captured by our prototype TTFS_classic imager.
However, as the TTFS_classic is scaled up to a large array, the reconstruction
error might become prominent. We will deal with this issue in the next chapter.


Table 5-2: The performance of the TTFS_classic imager.

Technology                      0.18 µm TSMC Digital CMOS
Supply Voltage                  3.3V (analog), 1.8V (digital and pad)
Transistors per pixel           24
Array size                      128 x 128
Pixel size                      12.4 µm x 12.1 µm
Photosensitive area             2.99 µm x 2.97 µm
Die size                        5mm x 5mm
Power dissipation               8.7mW max @ 30 frames/second
Dark current                    <1.04nA/cm^2 @ room temperature
Conversion Gain                 23 µV/electron
Dynamic Range (single pixel)    109dB (measured), >133dB (theory)
Dynamic Range (array)           98dB (measured), >113dB (theory)
SNR (measured)                  46.4dB with signal swing = 1V
FPN (estimated)                 2.3%
Package                         PGA84M

CHAPTER  6 

MODIFIED  TTFS  IMAGERS 


In  this  chapter,  we  will  investigate  the  potential  readout  errors  for  a large  array 
size  TTFS  imager.  Several  modified  TTFS  imagers  are  proposed  here  to  reduce  the 
readout  errors. 

6.1  Performance  Limits  for  TTFS_classic 

One issue unique to the asynchronous readout arises when many pixels are under
similar illumination. These pixels send out request signals within a short period.
If the readout circuit is not fast enough to output all the request signals in real
time, some delay is inevitably introduced, which results in a temporal error in the
reconstructed pixel illumination. Since the photocurrent is inversely proportional
to the firing time, the relative error for each illumination can be expressed as

\[ RE = \frac{I - I_r}{I} = \frac{t_r - T}{t_r} = \frac{\Delta t}{T + \Delta t} \tag{6.1} \]

where $I$ and $I_r$ are the original and reconstructed photocurrents respectively,
$T$ is the firing time, $t_r$ is the received time stamp, and $\Delta t = t_r - T$ is
defined as the time delay of the output pulse. The above equation indicates that for
the same time delay $\Delta t$, the error is more serious for a high illuminance
pixel, where the firing time $T$ is very small. This problem is also more serious
for larger images, where the probability that many pixels fire within a short period
is higher. According to our analysis based on TSMC 0.18 µm digital CMOS process
results, the shortest acceptable integration time is 10 µs for the QCIF (144 x 176)
image format, and 100 µs for a 480 x 720 size image.
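
Equation (6.1) can be illustrated with a minimal Python sketch. The function name `relative_error` and the example firing times and delay are hypothetical values chosen for illustration; they are not measurements from the prototype.

```python
def relative_error(firing_time, delay):
    """Relative photocurrent error RE = dt / (T + dt), as in equation (6.1)."""
    if firing_time <= 0 or delay < 0:
        raise ValueError("firing_time must be positive and delay non-negative")
    return delay / (firing_time + delay)


# The same 1 us readout delay (assumed value) hurts a bright pixel far more
# than a dim one, because the bright pixel fires much earlier.
bright = relative_error(firing_time=10e-6, delay=1e-6)   # fires 10 us after reset
dim = relative_error(firing_time=10e-3, delay=1e-6)      # fires 10 ms after reset
print(f"bright pixel: {bright:.4f}, dim pixel: {dim:.6f}")
```

With these assumed numbers the bright pixel sees an error of about 9%, while the dim pixel sees about 0.01%.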

The readout time delay is introduced at two places: the column arbitration for
reading out requesting pixels in the same row, and the time needed for disabling
pixels, row arbitration and latching column requests. Two obvious directions for
reducing the readout delay are to improve the readout throughput by decreasing the
circuit delay, or to modify the system architecture to reduce the collision
probability.

To evaluate the severity of this issue, a MATLAB-based TTFS imager simulator was
built. Critical timing information is extracted from worst-case SPICE simulations.
The four high dynamic range (HDR) images used in the simulation, namely nave,
groveC, rosette and vinesunset, are from Paul Debevec's graphics research group at
the University of Southern California [46]. These four images use a floating-point
representation and have dynamic ranges from 88dB to 168dB. The subsampled images are
slightly larger than QCIF (144 x 176) size images, which are commonly used in
hand-held devices. The statistics of these images are summarized in table 6-1.

Table 6-1: Four 480 x 720 and four 160 x 180 HDR images used in MATLAB simulation.

Images with a size of 480 x 720
Image name            nave           groveC         rosette        vinesunset
Minimum value         1.69 x 10^-?   5.53 x 10^-4   2.63 x 10^-?   1.44 x 10^-?
Maximum value         4.27 x 10^?    8.80 x 10^?    8.28 x 10^4    3.61 x 10^4
Dynamic range (dB)    168            124            130            88

Images with a size of 160 x 180
Image name            nave           groveC         rosette        vinesunset
Minimum value         1.69 x 10^-?   9.13 x 10^-4   2.63 x 10^-?   1.58 x 10^-?
Maximum value         2.59 x 10^?    4.93 x 10^?    7.43 x 10^4    3.31 x 10^4
Dynamic range (dB)    164            115            129            ?

Table 6-2: The simulated mean relative errors introduced by the TTFS_classic imager
for video mode applications

Image        nave             groveC          rosette          vinesunset
160 x 180    1.844 x 10^-?    1.887 x 10^-?   1.400 x 10^-4    1.121 x 10^-2
480 x 720    1.7772 x 10^-?   1.90 x 10^-?    1.8365 x 10^-4   5.9142 x 10^-4

For the 160 x 180 size images, the simulated mean relative errors (MREs) introduced
by a TTFS_classic imager are listed in table 6-2. These MREs correspond to SNRs of
more than 40dB if time-delay introduced errors are regarded as the only noise
component. Considering that typical CMOS APS imagers have a maximum SNR of around
40dB, these errors are not significant. In contrast, for very large images, the
TTFS_classic imager cannot handle the large throughput and thus introduces more
reconstruction errors (see table 6-2). For instance, image vinesunset has a large
portion of pixels firing shortly after reset, e.g., within 100 µs, and its
reconstruction error is inevitably higher.

Figure 6-1: Images nave and groveC. (a) nave (bright part), (b) nave (dark part:
x100 brighter for display), (c) groveC (bright part), (d) groveC (dark part: x20
brighter for display).

The above numerical simulation shows that the asynchronous readout delay limits the
dynamic range of a large TTFS_classic imager at the high end of illuminance (or
equivalently at the low end of firing time). Thus, more efficient designs are
necessary to reduce the readout errors.

Figure 6-2: Images rosette and vinesunset. (a) rosette (bright part), (b) rosette
(dark part: x100 brighter for display), (c) vinesunset (bright part), (d) vinesunset
(dark part: x10 brighter for display).

6.2  TTFS  with  Rolling  Shutter 

In the previous section, the MATLAB simulation demonstrated that the readout delay
error of the TTFS_classic is not significant for medium size images with a dynamic
range of more than 100dB. However, for large images the asynchronous readout still
limits the achievable dynamic range through the shortest usable firing time (e.g.,
100 µs). If a large number of pixels fire shortly after global reset, the high
collision rate makes the readout errors more prominent. To avoid a high collision
rate in this case, we propose a modification to the TTFS_classic architecture using
a rolling shutter technique. The new architecture (TTFS_RS) has a separate reset
signal for each row instead of a single global reset. For natural images, neighboring
pixel intensities are strongly correlated [47], so under a global reset the firing
times of neighboring rows tend to occur close in time. De-correlating the firing
times decreases the probability of simultaneous readout requests and therefore
reduces the collision rate. The rolling shutter spreads out the firing times of
neighboring rows. An alternative implementation would be to place filters with
differing attenuation factors over the pixel array. This would also decorrelate
neighboring pixel values while still using the same global reset signal. The
attenuation values would have to be known or calibrated for each pixel in order to
reconstruct the image.
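
The effect of staggering the row resets can be illustrated with a small sketch. It is not the MATLAB simulator used in this work: the function `peak_requests`, the tiny uniform scene and the 10 µs bin are hypothetical, and the simple row offset of `m * shutter_delay_us` corresponds to the still mode pattern rather than the grouped video mode pattern.

```python
from collections import Counter


def peak_requests(firing_times_us, shutter_delay_us, bin_us):
    """Largest number of readout requests that fall into one time bin.

    firing_times_us[m][n] is the time (whole microseconds) pixel (m, n) needs
    to fire after its own reset; row m is reset m * shutter_delay_us after row 0.
    """
    bins = Counter()
    for m, row in enumerate(firing_times_us):
        for t in row:
            bins[(t + m * shutter_delay_us) // bin_us] += 1
    return max(bins.values())


# Hypothetical uniform scene: 4 rows x 3 columns, every pixel fires after 100 us.
scene = [[100] * 3 for _ in range(4)]
global_reset_peak = peak_requests(scene, shutter_delay_us=0, bin_us=10)
rolling_peak = peak_requests(scene, shutter_delay_us=20, bin_us=10)
print(global_reset_peak, rolling_peak)
```

Under a global reset all twelve requests land in the same bin, whereas the staggered resets leave at most one row (three requests) per bin.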


Figure 6-3: Rolling shutter pattern for still mode applications
[Timing diagram: reset signals rst_row(1), rst_row(2), ..., rst_row(i), with shutter delay Δt(i).]

For still mode applications with a constant reference voltage scheme, we can adopt
the typical rolling shutter pattern shown in figure 6-3, and no specific modification
of the scene reconstruction is needed. For video mode applications, there are two
alternatives for providing the reference voltage: one D/A conversion per row or one
D/A conversion per chip. In the former case, the typical rolling shutter pattern is
still feasible, while a special rolling shutter pattern must be generated in the
latter case. If the whole array shares one D/A conversion, the rolling shutters must
have the same rising edge or starting time (see figure 6-4). A simple digital circuit
shown in figure 6-5 can be used to generate this special rolling shutter pattern.
Every k rows, e.g., 96, are grouped together to share one rolling shutter pattern.
Inevitably, this technique adds some burden to the reconstruction. For row m, if a
global piece-wise linear reference voltage pattern is applied, the time stamp
$T_0(m)$ at which the reference voltage starts to change is

\[ T_0(m) = T_0(0) - (m \bmod k) \times \Delta t_{shutter} \tag{6.2} \]

where $\Delta t_{shutter}$ is the unit shutter delay. The corresponding frame period
becomes

\[ T_{frame}(m) = T_{frame}(0) - (m \bmod k) \times \Delta t_{shutter} \tag{6.3} \]


If the time stamp for the received pixel (m, n) is $t_r(m, n)$, the reconstructed
photocurrent can be calculated by

\[ I_r(m,n) = \begin{cases}
\dfrac{Q_{well}}{t'_r(m,n)} & \text{if } t'_r(m,n) < T_0(m) \\[2ex]
\dfrac{Q_{well}}{t'_r(m,n)} \cdot \dfrac{T_{frame}(m) - t'_r(m,n)}{T_{frame}(m) - T_0(m)} & \text{if } t'_r(m,n) \ge T_0(m)
\end{cases} \]

where $t'_r(m,n) = t_r(m,n) - (m \bmod k) \times \Delta t_{shutter}$, and $Q_{well}$
is the well capacity given by $Q_{well} = (V_{reset} - V_{ref}) C_{pd}$.
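
A minimal sketch of this reconstruction is given below. It assumes that the timing of row m is shifted by (m mod k) unit shutter delays, that the full well is integrated before the reference voltage starts to change, and that the remaining charge falls linearly to zero at the end of the frame. The function name and all numeric values (well charge, frame period, shutter delay) are hypothetical.

```python
import math


def reconstruct_photocurrent(t_r, m, k, dt_shutter, t0_row0, t_frame_row0, q_well):
    """Reconstructed photocurrent of a pixel in row m from its time stamp t_r (seconds)."""
    shift = (m % k) * dt_shutter
    t0 = t0_row0 - shift            # row-specific start of the reference ramp
    t_frame = t_frame_row0 - shift  # row-specific frame period
    t_shifted = t_r - shift         # time stamp corrected for the row's shutter delay
    if t_shifted <= 0 or t_shifted > t_frame:
        raise ValueError("time stamp lies outside the row's integration window")
    if t_shifted < t0:
        # Constant reference voltage: the full well was integrated.
        return q_well / t_shifted
    # Ramping reference voltage: only part of the well was integrated.
    return q_well / t_shifted * (t_frame - t_shifted) / (t_frame - t0)


# Assumed example values: 50 fC well, 33 ms frame, ramp starting at 10 ms,
# 20 us unit shutter delay, groups of 96 rows.
params = dict(k=96, dt_shutter=20e-6, t0_row0=10e-3, t_frame_row0=33e-3, q_well=50e-15)
early = reconstruct_photocurrent(1e-3, m=0, **params)
late = reconstruct_photocurrent(21.5e-3, m=0, **params)
same_light_row2 = reconstruct_photocurrent(1e-3 + 2 * 20e-6, m=2, **params)
print(early, late, same_light_row2)
```

A pixel in row 2 that receives the same light as a pixel in row 0 reports a time stamp that is two shutter delays later, and the correction maps both to the same photocurrent.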


Figure 6-4: Rolling shutter pattern for video mode applications
[Timing diagram: within frame j, reset signals rst_row(1), rst_row(2), ..., rst_row(i), with shutter delay Δt(i).]


No matter which rolling shutter pattern is used, captured images will exhibit
scattered firing times. Here we take image vinesunset as an example to show how
effectively the rolling shutter scatters the firing times. Compared to that for a
global reset in figure 6-6, the firing time histogram for the rolling shutter reset
(figure 6-7) shows a much larger spread, as expected.

Figure 6-5: Rolling shutter generator for video mode applications


Figure  6-6:  Simulated  firing  time  histogram  of  vinesunset  for  the  TTFS_classic  imager 

To gauge the quality of a TTFS_RS imager, we investigate the MREs for the HDR images
with a 480 x 720 array size. The simulated MREs listed in table 6-3 demonstrate that
the TTFS_RS imager successfully decreases the reconstruction error by spreading out
the firing times. For still mode applications, a longer shutter delay is preferred.
However, the rolling shutter delays between pixel resets must be kept short in
dynamic scenes to prevent motion artifacts. To aid visualization, the reconstructed
vinesunset images are included in figure 6-8. The image reconstructed by the
TTFS_classic has prominent errors in the bright part, whereas the TTFS_RS reduces
the magnitude of the errors so that no visible artifacts remain in the reconstructed
image.

Figure 6-7: Simulated firing time histogram of vinesunset for the TTFS_RS imager.

Table 6-3: The simulated mean relative errors introduced by the TTFS_RS imager
for video mode applications

Image            nave             groveC           rosette          vinesunset
delay = 0 µs     1.7772 x 10^-?   1.90 x 10^-?     1.8365 x 10^-4   5.9142 x 10^-4
delay = 20 µs    4.1360 x 10^-?   2.0892 x 10^-?   4.3259 x 10^-?   6.6035 x 10^-?
delay = 35 µs    3.5200 x 10^-?   1.7239 x 10^-?   5.3687 x 10^-?   1.2677 x 10^-?
delay = 50 µs    3.7302 x 10^-?   1.0456 x 10^-?   6.8610 x 10^-?   4.2494 x 10^-?

6.3  Hybrid  TTFS  Imager 

Biological vision and electronic image acquisition share some common principles, and
local memory is one of the common features [48]. A digital pixel sensor (DPS) with
local memory has been designed by the Smart Image Sensor Group at Stanford University
[49], which shows the feasibility of implementing local memory in silicon. The
potentially high collision rate of asynchronous readout makes the TTFS_classic
unfavorable for a very large array. However, this problem can be avoided by
incorporating local memory. To achieve a fine A/D conversion over a wide dynamic
range of 120dB, at least 20 bits of local memory are needed for conventional
time-based methods, which dramatically increases the pixel layout area and degrades
the sensor's spatial resolution in a typical CMOS process. The thin film on ASIC
(TFA) technology is probably the best means to implement this architecture, since it
vertically integrates an amorphous silicon detector and a crystalline
application-specific circuit for pixel readout and optional signal processing [50].
However, introducing special layers into a conventional process tremendously
increases the cost of fabrication. Therefore, for an image sensor fabricated in a
conventional CMOS process, we have to decrease the local memory area in favor of
spatial resolution. Luo [31] has proposed a two-degree-of-freedom quantization
technique, which can use 8 bits to represent high dynamic range images of over 90dB.
It is in fact a pseudo-log compression of a wide dynamic range image at the cost of
increased quantization noise. However, it greatly improves the feasibility of using
local memory for time-based image sensors. Here, we adopt this two-degree-of-freedom
quantization principle and propose a novel TTFS imager with local memory, called
TTFS_LM.

Figure 6-8: Image vinesunset. (a) bright part of the reconstructed image by the
TTFS_classic. (b) dark part of the reconstructed image by the TTFS_classic. (c)
bright part of the reconstructed image by the TTFS_RS. (d) dark part of the
reconstructed image by the TTFS_RS.

6.3.1  Principle  of  TTFS_LM

The pixel schematic of this novel sensor is shown in figure 6-9. It consists of a
photodiode, a comparator, a level-down shifter and a 9-bit 3T-DRAM. Unlike
Stanford's DPS, this image sensor works in a continuous mode to sense the input
transition, thus a photodiode instead of a photogate is used. Note that the positive
and negative signs in the opamp/comparator block denote the inputs for the opamp
only, and the signs should be switched in the comparator schematic (see figure 6-11
for details). We apply the varying reference voltage to the positive input of the
opamp and generate the digital value in an external counter. By carefully adjusting
the reference voltage, we can implement uniform or nonuniform quantization. The
circuit operates in two phases: a time-to-first-spike (or saturation sensing) phase
and a conventional fine A/D conversion phase. The transistors labelled with * are
thick oxide 3.3V transistors, which are used to achieve a large input signal swing
and a low leakage current.


Figure 6-9: Pixel schematic of the TTFS_LM imager

During the time-to-first-spike period, or phase 1, the photocurrent discharges the
photodiode and $V_{in}$ drops continuously until it reaches a constant reference
voltage $V_{ref} = V_{min}$. Then the comparator flips, V1 goes high, and V2 goes
low. Simultaneously, the digital value generated by the external counter is latched
in the local memory. The most significant bit (MSB) indicates that the digital value
latched here is for phase 1. The digital value can be generated with uniform or
nonuniform quantization. With uniform quantization, the achievable dynamic range is
only 6 x 9 = 54dB, while with nonuniform quantization (such as log compression), the
dynamic range can be extended beyond 100dB.
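
As a rough check of the uniform quantization figure, assuming the dynamic range is taken as the ratio of the largest to the smallest of the $2^9$ code levels:

\[ DR_{uniform} = 20\log_{10}\left(2^{9}\right) = 9 \times 20\log_{10} 2 \approx 9 \times 6.02\,\mathrm{dB} \approx 54.2\,\mathrm{dB} \]

By the same measure, exceeding 100dB requires the nonuniform code to span a photocurrent ratio of more than $10^{5}$.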

During the fine A/D conversion period, i.e., phase 2, the remaining voltage of the
photodiode is first stored on M1 by closing the shutter. At this moment, we apply a
ramp voltage to $V_{ref}$. The ramp starts below the lowest expected voltage at the
sense node, which causes V2 to rise and enables the memory to load the gray code
values generated by the external counter. Next, the ramp voltage increases linearly
until it exceeds the reset voltage. When the ramp exceeds the sense node voltage, V2
goes low, and the pixel memory latches the corresponding gray code. The 8-bit gray
code is stored in the LSBs of the local memory, while MSB=0 marks the value as a
fine A/D conversion result.

Figure 6-10 illustrates the principle of this novel TTFS_LM imager. Depending on the
photocurrent amplitude, the memory latches one of two digital outputs: a
log-compression code, or a gray code from the single-slope A/D conversion. Of course,
the log-compression code can be replaced with other codes, e.g., a uniform
quantization code. The figure clearly shows that the local memory latches only once
for each capture or frame. No readout delay or confusion arises, hence the TTFS_LM
solves the potential collision problem of the TTFS_classic.
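
The two-phase latching can be sketched as follows. This is only an illustration under stated assumptions: the names `to_gray` and `latch_word` are hypothetical, the phase 1 flag is taken to be MSB = 1 (the text only states that MSB = 0 marks phase 2), and the ramp is modelled as 2^8 equal steps between two assumed voltages.

```python
def to_gray(n):
    """Binary-reflected Gray code of a non-negative integer."""
    return n ^ (n >> 1)


def latch_word(fired_count, v_sense, v_low, v_high, bits=8):
    """Return the (bits + 1)-bit word a pixel latches once per frame.

    fired_count: external counter value at which the pixel fired in phase 1,
                 or None if it did not saturate before the shutter closed.
    v_sense:     sense-node voltage held after the shutter closes (phase 2 input).
    """
    levels = 1 << bits
    if fired_count is not None:
        # Phase 1: keep the counter value and flag it with MSB = 1 (assumption).
        return (1 << bits) | (fired_count & (levels - 1))
    # Phase 2: single-slope conversion; latch when the ramp exceeds v_sense.
    step = (v_high - v_low) / levels
    for count in range(levels):
        if v_low + (count + 1) * step > v_sense:
            return to_gray(count)          # MSB stays 0
    return to_gray(levels - 1)


bright_word = latch_word(fired_count=5, v_sense=0.0, v_low=1.0, v_high=3.0)
dim_word = latch_word(fired_count=None, v_sense=2.0, v_low=1.0, v_high=3.0)
print(bin(bright_word), bin(dim_word))
```

Each pixel writes its memory exactly once per frame in either branch, which is the property that removes the readout collisions.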

Figure 6-10: Principles of the TTFS_LM imager

In summary, this novel design has the following advantages:

1. It can determine the optimal shuttering time. The first 2^8 levels can be
dedicated to obtaining rough statistics of the pixel firing times, from which the
optimal shuttering time is decided.

2. It extends the dynamic range of each capture or frame. The amount of extension
depends on the quantization method adopted in phase 1.

3. It can realize a multiple sampling technique, implemented by assigning individual
capture times or shuttering times.

4. It can capture a high dynamic range scene with fewer captures than a conventional
multiple sampling technique.

5. It has a simpler $V_{ref}$ pattern than Luo's method.

6. It enjoys better SNR, which will be discussed in section 6.3.4.

6.3.2  Pixel  Design 

Since both nonuniform and uniform quantization can be used in this system, the
digital CDS technique [49] is not applicable here. Instead, we use an autozeroing
technique to reset the photodiode and reduce the offset FPN simultaneously. Thus the
comparator is also required to work as an opamp. To achieve a small layout area, a
relatively high gain and a large bandwidth, we use the two-stage comparator topology
shown in figure 6-11, where the first stage is a conventional 5-transistor opamp, and
the second stage is made up of N3 and P3. All these transistors are thick oxide
transistors. The design criteria for this opamp/comparator are as follows:


1. Assume that the maximum ADC resolution is m = 8 bits and that the trigger point
of the following level-down shifter is 1V. If the input voltage is 2V, the comparator
gain is required to be at least 1/2 x 2^8 = 128.

2. Suppose the total A/D conversion time is 500 µs. Then each comparison time is
500 µs/256 = 1.95 µs, which means the comparator bandwidth must be at least 81.5 kHz.

3. If the maximum comparator introduced delay is required to be less than 2 µs, then
the lower bound of the opamp bandwidth is 79.5 kHz, which is less than 81.5 kHz.

4. Minimize power consumption.

Figure 6-11: Comparator/opamp in the TTFS_LM


Table 6-4 lists the performance of this topology obtained from CADENCE simulations,
which satisfies all the criteria. The whole circuit operates in the weak inversion
region to save power and to achieve a high comparator/opamp DC gain.
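
The numbers in the design criteria can be reproduced with a few lines of arithmetic. The sketch below assumes a single-pole relation f = 1/(2πτ) between a time constant τ and the bandwidth, which is not stated in the text but matches the quoted 81.5 kHz; the variable names are illustrative.

```python
import math

adc_bits = 8
trigger_v = 1.0            # trigger point of the level-down shifter (criterion 1)
input_v = 2.0              # input voltage assumed in criterion 1
conversion_time = 500e-6   # total A/D conversion time (criterion 2)
max_delay = 2e-6           # maximum comparator delay (criterion 3)

min_gain = trigger_v / input_v * 2 ** adc_bits
step_time = conversion_time / 2 ** adc_bits
bw_from_step = 1 / (2 * math.pi * step_time)    # assumed single-pole relation
bw_from_delay = 1 / (2 * math.pi * max_delay)   # assumed single-pole relation

print(f"minimum gain: {min_gain:.0f} ({20 * math.log10(min_gain):.1f} dB)")
print(f"comparison time: {step_time * 1e6:.2f} us")
print(f"bandwidth from comparison time: {bw_from_step / 1e3:.1f} kHz")
print(f"bandwidth from delay limit: {bw_from_delay / 1e3:.1f} kHz")
```

Under this assumption the delay limit gives roughly 79.6 kHz, close to the 79.5 kHz quoted in criterion 3, and the comparison time gives 81.5 kHz as in criterion 2.
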

Table 6-4: Performance of the proposed opamp/comparator

Opamp/comparator gain      44dB/75.8dB
Opamp/comparator f_-3dB    114kHz
Phase margin               79°
Total current              300nA

Figure  6-12:  Achievable  bit  resolution  vs.  frame  rate 


The maximum achievable ADC resolution is limited by the opamp bandwidth. There are
two ways to improve it. One way is to increase the bias current, which improves the
bandwidth. Of course, this increases the power consumption and simultaneously
decreases the opamp gain, which affects the FPN reduction. The other way is to
lengthen the A/D conversion period, which lowers the frame rate. Suppose the maximum
DRAM holding time is 10ms and the fine A/D conversion takes half of the holding time.
The simulated achievable ADC resolution is plotted in figure 6-12, which shows that
a higher frame rate leads to a lower ADC resolution. As discussed, both the
comparator gain and the maximum DRAM holding time limit the upper bound of the
achievable ADC resolution. The 3T-DRAM is shown in figure 6-13. A typical design has
achieved a maximum data hold time of 10ms [49], which gives an ADC resolution of
$m = [(10\,\mathrm{ms}/2) \cdot 2\pi \cdot f_{-3dB}] = 10$. Overall, there are 38
transistors in each pixel. We expect each pixel to occupy a 12 µm x 12 µm area with a
fill factor of around 15% in a typical 0.18 µm digital CMOS technology.


Figure 6-13: Schematic of 3T-DRAM in the TTFS_LM


6.3.3  Clock  Pattern  for  Uniform  Quantization  in  Phase  1 

To achieve uniform quantization of the photocurrent, a special clock pattern is
required. Suppose the shutter is closed at $T_0$, and the single-slope A/D conversion
is performed in the following period. Thus, the maximum nonsaturated photocurrent in
phase 1 is

\[ I_0 = \frac{Q_{well}}{T_0} \tag{6.4} \]

where $Q_{well}$ is the available well capacity. Assuming that a 9-bit pixel-level
DRAM is integrated, the photocurrent resolution is

\[ \Delta I = \frac{Q_{well}}{T_0 \cdot 2^8} \tag{6.5} \]

In order to achieve the same quantization resolution in the saturation sensing period
with a fixed reference voltage, we have to adjust the timing of the external ramp
generation. For $I_k$, $k = 1, \cdots, 256$, the individual timing would be

\[ t_k = \frac{Q_{well}}{I_0 + k\Delta I} = \frac{T_0}{1 + k/256} \tag{6.6} \]

This expression not only gives the specific timing information but also provides the
speed requirement on the external clock for sufficient A/D conversion accuracy.
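
The clock pattern of equation (6.6) and the resulting clock speed requirement can be sketched as follows. The function name `phase1_clock_times` and the shutter time of 5 ms are hypothetical and serve only as an illustration.

```python
def phase1_clock_times(t0, levels=256):
    """Time stamps t_k = T0 / (1 + k/levels) for k = 1..levels, as in equation (6.6)."""
    return [t0 / (1 + k / levels) for k in range(1, levels + 1)]


t0 = 5e-3                                   # assumed shutter time T0
times = phase1_clock_times(t0)              # decreasing from about T0 down to T0/2
smallest_gap = min(a - b for a, b in zip(times, times[1:]))
print(f"t_1 = {times[0] * 1e3:.3f} ms, t_256 = {times[-1] * 1e3:.3f} ms")
print(f"finest required clock step = {smallest_gap * 1e6:.2f} us")
```

The time stamps are not equally spaced: the spacing shrinks towards $T_0/2$, and the smallest gap (here $T_0/1022$, about 4.9 µs) sets the resolution that the external clock has to provide.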

6.3.4  Novel  Multiple  Sampling  Technique 

A conventional multiple sampling scheme samples the pixel output at exponentially
increasing exposure times, $T, 2T, \cdots, 2^k T$. Since the TTFS_LM pixel is able to
extend the dynamic range of each capture by a factor of 2 if uniform quantization is
used in phase 1, the exposure time can instead be increased exponentially by a factor
of 4. It is desirable to achieve good image quality with fewer captures, since fewer
captures reduce the computational power of the imaging system and the readout power
consumption of the image sensor [38]. For this novel exposure timing, the SNR can be
expressed as


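\[ SNR(i_{ph}) = \begin{cases}
\dfrac{(i_{ph} t_{int})^2}{q (i_{ph} + i_d) t_{int} + \sigma^2} & 0 \le i_{ph} \le I_0 - i_d \\[2ex]
\dfrac{Q_{well}^2}{Q_{well} \cdot q + \sigma^2} & I_0 - i_d \le i_{ph} \le 2 I_0 - i_d \\[2ex]
\dfrac{(i_{ph} t_{int}/4^k)^2}{q (i_{ph} + i_d) t_{int}/4^k + \sigma^2} & 2^{2k-1} I_0 - i_d < i_{ph} < 2^{2k} I_0 - i_d \\[2ex]
\dfrac{Q_{well}^2}{Q_{well} \cdot q + \sigma^2} & 2^{2k} I_0 - i_d < i_{ph} < 2^{2k+1} I_0 - i_d
\end{cases} \tag{6.7} \]

where $I_0 = Q_{well}/t_{int}$, $Q_{well}$ is the well capacity, and $t_{int}$ is the
longest integration time.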

The  simulated  SNR  is  plotted  in  figure  6-14.  Compared  with  a typical  multiple 
sampling  technique,  this  novel  multiple  sampling  enjoys  better  SNRs,  since  most 
photocurrents  are  able  to  achieve  their  full  well  capacity  (see  figure  6-15). 


To save power, a conventional multiple sampling technique could also scale the
capture times exponentially by a factor of 4. If so, however, the SNR is dramatically
degraded. In figure 6-16, we observe numerous 6dB dips in the middle of the range,
which is much worse than the novel multiple sampling technique.

Figure 6-14: SNR of the multiple sampling technique for the TTFS_LM

Of course, this good performance is achieved under the assumption that the
integration time can be set arbitrarily long. This assumption, however, does not
always hold for DRAMs. The maximum holding time of the DRAM sets the longest
achievable integration time, i.e., approximately $T_{int} = T_{hold}$. If an SRAM, or
a DRAM with a longer holding time, is used as local memory, this performance limit
can be overcome.

6.4  Contrast  Mode  TTFS  (TTFS_CM) 

As noted previously, natural scenes have amplitude spectra that fall inversely with
frequency, roughly as 1/f [51]. Correspondingly, in biological vision systems, the
neurons are more sensitive to local contrast than to the absolute illuminance value.
High contrast is read out ahead of low contrast, which achieves some limited
compression. We believe this characteristic helps to reduce the collision rate when
uniform illumination falls on a sensor. In addition, a contrast-mode image is very
useful in early vision computation.

Figure 6-15: Comparison of SNRs when the exposure time in the conventional multiple
sampling technique is increased by a factor of 2.


6.4.1  Principles  of  Current-mode  Diffusor 

To compute a local spatial average, a resistive network is required [20].
Conventional methods of implementing a resistor in VLSI technology include using a
MOSFET or a complex transconductance amplifier. Both methods work in the voltage
mode and exhibit linear resistance only over a limited range of voltages [52]. To
circumvent this limitation, a current mode implementation of resistive networks has
been proposed [53]. Boahen [54] uses the concept of the current diffusor, illustrated
in figure 6-17, to explain the principle of this transistor network.

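Assuming that all the PMOS transistors are identical, the transistors connecting
to $V_R$ are in the linear region, and the remaining transistors work in the
saturation region, then the current $I_{j,j+1}$ flowing between nodes $j$ and $j+1$ is

\[ \begin{aligned}
I_{j,j+1} &= I_0 e^{\frac{(V_{dd}-V_G)+V_{th}}{nU_T}} \left( e^{-\frac{V_{dd}-V_j}{U_T}} - e^{-\frac{V_{dd}-V_{j+1}}{U_T}} \right) \\
&= I_0 e^{\frac{(V_{dd}-V_G)+V_{th}}{nU_T}} \left( \frac{I_{out_j}}{I_0 e^{\frac{V_{dd}-V_R+V_{th}}{nU_T}}} - \frac{I_{out_{j+1}}}{I_0 e^{\frac{V_{dd}-V_R+V_{th}}{nU_T}}} \right) \\
&= e^{\frac{V_R-V_G}{nU_T}} \left( I_{out_j} - I_{out_{j+1}} \right)
\end{aligned} \tag{6.8} \]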
Figure 6-16: Comparison of SNRs when the exposure time in the conventional multiple
sampling technique is also exponentially scaled by a factor of 4.

Applying KCL at node $j$, we obtain

\[ I_{out_j} = I_{in_j} + e^{\frac{V_R-V_G}{nU_T}} \left( I_{out_{j-1}} - 2I_{out_j} + I_{out_{j+1}} \right) \tag{6.9} \]

where $(I_{out_{j-1}} - 2I_{out_j} + I_{out_{j+1}})$ is the discrete approximation of
the operator $\frac{d^2}{dx^2}$. Thus, the output current can be modelled as the sum
of the input current and the local spatial contrast, which is

\[ I_{out_j} = I_{in_j} + I_{contrast_j} \tag{6.10} \]

where $I_{contrast_j} = e^{\frac{V_R-V_G}{nU_T}} \left( I_{out_{j-1}} - 2I_{out_j} + I_{out_{j+1}} \right)$.
Hence, the local contrast information can be obtained by subtracting the input
current from the output current.
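
The behaviour of such a network can be explored numerically with a short sketch. It solves the linear system I_out[j] = I_in[j] + a (I_out[j-1] - 2 I_out[j] + I_out[j+1]) for a 1-D array, where the coupling factor a stands for the exponential term in $V_R - V_G$. The function names, the coupling value a = 1 and the boundary condition (end nodes with a single neighbour) are assumptions made for illustration; this is not the SpectreS circuit simulation.

```python
def solve_diffusor(i_in, coupling):
    """Solve I_out[j] = I_in[j] + coupling * (I_out[j-1] - 2*I_out[j] + I_out[j+1]).

    Assumed boundary condition: the end nodes have a single neighbour, so no
    current leaves the 1-D array. The tridiagonal system is solved with the
    Thomas algorithm.
    """
    n = len(i_in)
    if n < 2:
        raise ValueError("need at least two nodes")
    diag = [1.0 + 2.0 * coupling] * n
    diag[0] = diag[-1] = 1.0 + coupling
    off = -coupling                      # sub- and super-diagonal entries
    c = [0.0] * n                        # modified super-diagonal
    d = [0.0] * n                        # modified right-hand side
    c[0] = off / diag[0]
    d[0] = i_in[0] / diag[0]
    for j in range(1, n):
        denom = diag[j] - off * c[j - 1]
        c[j] = off / denom
        d[j] = (i_in[j] - off * d[j - 1]) / denom
    i_out = [0.0] * n
    i_out[-1] = d[-1]
    for j in range(n - 2, -1, -1):
        i_out[j] = d[j] - c[j] * i_out[j + 1]
    return i_out


def contrast(i_in, coupling):
    """Local contrast current I_out[j] - I_in[j] at every node."""
    return [o - i for o, i in zip(solve_diffusor(i_in, coupling), i_in)]


step = [100e-9] * 8 + [1e-6] * 8         # input step from 100 nA to 1 uA
edge = contrast(step, coupling=1.0)
flat = contrast([100e-9] * 16, coupling=1.0)
print([f"{x * 1e9:+.1f}" for x in edge])  # contrast in nA, largest next to the step
```

In this model a uniform input produces no contrast current, while a step input produces contrast of opposite sign on the two sides of the edge that decays away from it.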

6.4.2  TTFS_CM  Pixel  Design 

By applying a resistive network, several silicon retinas have been designed to
extract the local contrast information. In [55], an I-V converter is realized by
feeding the contrast current through two diode-connected MOSFETs. Since the readout
signal is still an analog signal, it is very sensitive to the readout noise. In
contrast, temporal information is believed to be more robust in a noisy environment.
Boahen [56] and Barbara [57] both use asynchronous readout circuitry to output
temporal information instead of analog values. Like other time-based CMOS image
sensors, Boahen's retina adopts the Pulse Frequency Modulation (PFM) scheme, and
many pulses need to be output for a good reconstruction. The resulting high power
consumption makes it unfavorable. In [56], the gradient information is obtained by
modulating the local contrast current with a predefined sine or cosine wave, which
increases the complexity of the front-end circuitry design. To circumvent these
disadvantages, we propose a new contrast-mode imager (TTFS_CM), which still follows
the time-to-first-spike readout scheme. The new pixel schematic is shown in figure
6-18 and works as follows:

1. The node X is initially reset to $V_{reset}$, the midpoint of the defined range
($V_{ref\_low}$, $V_{ref\_high}$), by turning on transistor M9.

2. After ~rst goes high, a contrast current $I_{contrast_j}$ is generated through the
current diffusor and the current mirrors formed by M2-M8. This contrast current
either charges or discharges node X. When the voltage leaves the predefined range,
the output node of one of the comparators flips, and the node req goes high. Then the
pixel sends out a request to the row arbiter by pulling down ~row_request(m).

3. If the row arbiter selects this row by making ~row_select(m) high, the column
request signals ~col_request(n) and sign labels ~sign_request(n) are sent to the
column latch.

4. After the ~col_request(n) signals are latched, the corresponding control signals
disable(m) and col_req(n) are generated to disable the pixel by switching on
transistors M12 and M13. Thus the pixel will not fire again until the next reset
phase turns off M12 and M13.

Figure 6-18: Pixel schematic of the TTFS_CM

Note that M10 and M15 are included in the circuitry to ensure valid logic levels
during the reset period. The comparator can still be a simple 5-transistor opamp.
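
The relation between the contrast current and the resulting spike can be sketched with a simple model. It assumes that node X behaves as a constant capacitance charged by a constant current, and that a positive contrast current charges the node towards the upper threshold; the function names, the 10 fF node capacitance and the threshold voltages are hypothetical.

```python
def contrast_spike(i_contrast, c_node, v_reset, v_low, v_high):
    """Return (firing_time, sign) for node X, or None if the current is zero."""
    if i_contrast == 0:
        return None                       # the node never leaves the window
    if i_contrast > 0:
        return c_node * (v_high - v_reset) / i_contrast, +1
    return c_node * (v_reset - v_low) / -i_contrast, -1


def spike_to_contrast(firing_time, sign, c_node, v_reset, v_low, v_high):
    """Convert a firing time and its sign label back into a contrast current."""
    swing = (v_high - v_reset) if sign > 0 else (v_reset - v_low)
    return sign * c_node * swing / firing_time


window = dict(c_node=10e-15, v_reset=1.65, v_low=1.15, v_high=2.15)  # assumed values
t_pos, s_pos = contrast_spike(1e-9, **window)      # 1 nA charging current
t_neg, s_neg = contrast_spike(-2e-9, **window)     # 2 nA discharging current
recovered = spike_to_contrast(t_pos, s_pos, **window)
print(t_pos, s_pos, t_neg, s_neg, recovered)
```

In this model a stronger contrast fires earlier, so high contrast is read out ahead of low contrast, and the sign label preserves the polarity of the edge.
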
6.4.3  Simulation  Results  and  Discussion 

The  proposed  circuit  was  investigated  using  the  SpectreS  simulator.  After  a step 
signal  is  fed  into  a 1-D  array,  the  output  signals  are  measured  in  terms  of  firing  times 
and  then  converted  back  into  currents. 
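
The conversion from firing times back to currents can be sketched in the same spirit. The snippet below assumes an ideal integration on a fixed node capacitance and a symmetric voltage swing around the reset level (the reset level is the midpoint of the reference range); the function name and the default values are hypothetical.

```python
def current_from_firing_time(t_fire, sign, c_node=50e-15, v_swing=0.5):
    """Recover a signed contrast current from a firing time and a sign label.

    Assumed model: I = sign * C * dV / t, where v_swing is the distance from
    the reset level to either reference level.
    """
    if t_fire <= 0:
        raise ValueError("firing time must be positive")
    return sign * c_node * v_swing / t_fire


# A pixel that fired 2.5 us after reset with a negative sign label
print(current_from_firing_time(2.5e-6, -1))
```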

Two different step signals are considered here, and the corresponding simulation
results are shown in figure 6-19 and figure 6-20, respectively. From the results, we
observe that the current diffusor network successfully extracts the edge information,
which is sparsely distributed and may therefore lead to a low collision rate for the
asynchronous readout scheme. We also find that the absolute amplitude of the input
signal affects the output signal value, which can be explained by the different
initial conditions for the differential equation 6.9. Different $V_g$ values are also
explored with a fixed $V_r = 3.4$ V. The results show that as $V_r - V_g$ becomes
positive, the output response increases somewhat and spreads more widely. As in [52],
we can use the diffusion length $\lambda$, an exponential function of the bias
voltages, to explain this phenomenon: as $V_r - V_g$ grows, $\lambda$ becomes much
more significant.

Figure 6-19: Simulation results of the TTFS-CM when input currents are 100 nA and
1 µA.
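
To illustrate why a diffusive network yields a sparse, edge-only response to a step input, the sketch below uses a linear 1-D resistive-grid model as a stand-in. This is an assumption made for illustration: it is not the nonlinear network of equation 6.9, and it does not reproduce the amplitude dependence noted above. The smoothed signal `s` solves `s[i] - lam**2 * (s[i-1] - 2*s[i] + s[i+1]) = x[i]`, and the contrast is taken as `x - s`. The names `smooth_1d`, `contrast_1d` and the parameter `lam` (a dimensionless diffusion length in units of pixels) are hypothetical.

```python
def smooth_1d(x, lam):
    """Solve the tridiagonal diffusion system with the Thomas algorithm.

    Reflecting boundaries are assumed; len(x) must be at least 2.
    """
    n = len(x)
    k = lam * lam
    diag = [1.0 + 2.0 * k] * n
    diag[0] = diag[-1] = 1.0 + k      # end nodes have a single neighbour
    off = -k                          # sub- and super-diagonal entries
    cp = [0.0] * n
    dp = [0.0] * n
    cp[0] = off / diag[0]
    dp[0] = x[0] / diag[0]
    for i in range(1, n):             # forward elimination
        denom = diag[i] - off * cp[i - 1]
        cp[i] = off / denom
        dp[i] = (x[i] - off * dp[i - 1]) / denom
    s = [0.0] * n
    s[-1] = dp[-1]
    for i in range(n - 2, -1, -1):    # back substitution
        s[i] = dp[i] - cp[i] * s[i + 1]
    return s


def contrast_1d(x, lam):
    """Contrast = input minus its diffused (locally averaged) version."""
    return [xi - si for xi, si in zip(x, smooth_1d(x, lam))]


# Step from 100 nA to 1 uA in the middle of a 64-pixel line
step = [100e-9] * 32 + [1e-6] * 32
c = contrast_1d(step, lam=2.0)
peak = max(range(len(c)), key=lambda i: abs(c[i]))
print(peak, c[31], c[32])
```

In this model the contrast is essentially zero away from the step and changes sign across it, and a larger `lam` spreads the response over more pixels, which is qualitatively the trend described for the diffusion length.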

The TTFS-CM actually works in the current mode. The previous simulation did not
consider current-mirror mismatch. If this effect were included, the results would
be somewhat “noisy”. As derived in Appendix B, a current mirror working in the
subthreshold region suffers from severe mismatch. Since most photocurrents are
very weak, the MOSFETs must be sized properly to push them toward the
above-threshold region instead of the subthreshold region and thereby lessen the
“noisy” mismatch. Considering these factors, the pixel layout is expected to occupy
an 80 µm × 80 µm area in a 0.6 µm technology.

Figure 6-20: Simulation results of the TTFS-CM when input currents are 10 nA and
100 nA.

6.5  Summary  and  Discussion 

In this chapter, several modified TTFS architectures were introduced to solve
the potentially high collision rate associated with a large TTFS-classic
imager. Simulation results have demonstrated the expected performance. Since
the TTFS-RS and the TTFS-classic share the same idea (a time-based imager
with asynchronous readout), and the TTFS-RS is just a minor modification of
the TTFS-classic, the TTFS-classic can easily be extended into the TTFS-RS.
The TTFS-CM might be useful in early vision computation: edge information
is extracted and output using the same asynchronous readout circuitry as in the
TTFS-classic. Once the pixel design is optimized, the TTFS-CM can easily be
integrated into the existing TTFS-classic structure. In addition, the complex design of
the TTFS-LM is beyond the scope of this dissertation. Therefore, implementations of
the TTFS-RS, the TTFS-CM and the TTFS-LM are skipped in this work.


CHAPTER  7 

CONCLUSIONS  AND  FUTURE  WORK 


In this dissertation, the fundamentals of photodetectors were reviewed. A
detailed analysis of conventional CMOS image sensors reveals their DR limitation. To
deal with high dynamic range scenes, a novel time-to-first-spike (TTFS) CMOS image
sensor is presented that circumvents the DR limitation. The advantages of the TTFS
imager include high DR, better SNR, feasibility for video mode applications and low
power consumption. A systematic study of an optimal strategy for reference voltage
variation in TTFS imagers is also included. The TTFS-classic imager is believed to
be a strong competitor among high dynamic range CMOS image sensors, especially
for the small to moderate size imager market. The potentially high collision rate,
however, prevents the TTFS-classic imager from scaling up to a large size. Several
modified TTFS architectures are proposed to solve this potentially high collision rate
problem.

Due to some practical considerations, a TTFS-classic imager was implemented
using TSMC 0.18 µm digital CMOS technology in this work. The prototype chip
demonstrates the expected performance, which is promising. To continue this
work, the author believes the following directions need to be considered:

1. The major remaining problem is that of nonuniform motion blur. This is a
serious concern for all time-based imagers, particularly in terms of human
subjective evaluation of the resulting images. Fundamentally, brighter pixels blur
less than darker pixels since brighter pixels are read out first. A similar problem
occurs with the multiple sampling techniques; however, researchers have already
begun to address post-processing methods to correct the nonuniform motion
blur [37]. We expect that similar post-processing techniques will be developed
for TTFS imagers.

2. As we discussed, the TTFS-LM seems to be the best choice to solve the collision
problem. Either an SRAM or a DRAM must be integrated into each pixel. A small
array of TTFS-LM pixels could be a good example to prove its functionality.

3. An image taken by the TTFS-CM imager is a contrast-mode image. A
conversion algorithm is needed to convert a contrast image back to its original
image. Currently, our TTFS-CM works in the current mode, and its performance
is degraded by current-mirror mismatch. More practical circuit design
considerations are needed to obtain a less “noisy” image.

4. Generally, the sensor response of a time-based imager is nonlinear. So far, all
the implemented time-based image sensors are monochromatic sensors, and no
color processing is involved. It is believed that a linear sensor response is highly
beneficial to color processing. Thus, finding a good method for handling color
images with time-based image sensors is another future research direction.


APPENDIX  A 

DC  OFFSET  FOR  AN  OPAMP  IN  THE  WEAK  INVERSION  REGION 
For a differential pair in the strong inversion region, the DC offset voltage
$V_{OS,in}$ is derived in [58] (equation A.1). The result reveals the dependence
of $V_{OS,in}$ on device mismatches and bias conditions. For an
opamp operating in the weak inversion region, the DC offset demonstrates a similar
dependence. However, the explicit DC offset expression differs because of the different
current-voltage characteristic. Assume that both the input transistors and the load
resistors in figure A-1 suffer from mismatch, i.e., $V_{th1} = V_{th}$,
$V_{th2} = V_{th} + \Delta V_{th}$, $(W/L)_1 = (W/L)$, $(W/L)_2 = (W/L) + \Delta(W/L)$,
and $I_{D1} = I_D$, $I_{D2} = I_D + \Delta I_D$. For simplicity, other mismatches are
neglected. If all the transistors operate in the weak inversion region and are
saturated, and $V_{OS,in} = V_{gs1} - V_{gs2}$, we have


$$
\begin{aligned}
V_{OS,in} &= U_T \ln\frac{I_{D1}}{I_s (W/L)_1} + V_{th1} - U_T \ln\frac{I_{D2}}{I_s (W/L)_2} - V_{th2} \\
&= U_T \ln\left[\frac{I_D}{I_D + \Delta I_D}\cdot\frac{(W/L) + \Delta(W/L)}{(W/L)}\right] - \Delta V_{th}
\end{aligned}
\tag{A.2}
$$

where $I_s = \mu_{eff} C_{ox} (m - 1) U_T^2$. Assuming $\Delta I_D / I_D \ll 1$ and
$\Delta(W/L)/(W/L) \ll 1$, and noting that $\ln(1 + x) \approx x$ for $x \ll 1$, we can
reduce the above equation to

$$
V_{OS,in} = U_T \left[\frac{\Delta(W/L)}{(W/L)} - \frac{\Delta I_D}{I_D}\right] - \Delta V_{th}
\tag{A.3}
$$

Compared with equation A.1, the offset for a differential pair in weak inversion is
much less than that in strong inversion, because $U_T$ is typically less than
$(V_{gs} - V_{th})/2$ at room temperature.
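
For comparison, the same steps can be repeated with the textbook square-law model for strong inversion. The following is only a sketch in the notation of this appendix, assuming $I_D = \frac{1}{2}\mu C_{ox}(W/L)(V_{gs} - V_{th})^2$ and neglecting channel-length modulation; it is not necessarily the form in which the result is stated in [58].

```latex
% Sketch under the square-law assumption (illustrative, not quoted from [58])
\begin{align*}
V_{gs} &= V_{th} + \sqrt{\frac{2 I_D}{\mu C_{ox} (W/L)}} \\
V_{OS,in} &= V_{gs1} - V_{gs2} \\
          &\approx \frac{V_{gs} - V_{th}}{2}
             \left[\frac{\Delta(W/L)}{(W/L)} - \frac{\Delta I_D}{I_D}\right]
             - \Delta V_{th}
\end{align*}
```

Under these assumptions the mismatch terms are weighted by $(V_{gs} - V_{th})/2$, whereas in weak inversion they are weighted by $U_T$, which is the comparison made above.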


Figure  A-1:  A differential  pair  with  offset  referred  to  the  input 


A typical 5-transistor opamp, shown in figure A-2, is used in our design. Suppose
all the transistors are biased in the weak inversion region and saturated. For the
NMOS pair, $V_{OS} = 0$, and then we have

$$
\frac{\Delta I_N}{I_N} = -\frac{\Delta V_{th,N}}{U_T} + \frac{\Delta(W/L)_N}{(W/L)_N}
\tag{A.4}
$$

With $\Delta I_P / I_P = \Delta I_N / I_N$, we obtain

$$
V_{OS,in} = U_T \left[\frac{\Delta(W/L)_P}{(W/L)_P} - \frac{\Delta(W/L)_N}{(W/L)_N}\right] - \Delta V_{th,P} + \Delta V_{th,N}
\tag{A.5}
$$

Figure A-2: 5-transistor differential pair with offset referred to the input


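Since mismatches are assumed to be independent statistical variables,¹ we can express
the standard deviation of the offset as

$$
V_{OS,in} = \left[\Delta V_{th,P}^2 + \Delta V_{th,N}^2 + U_T^2 \left(\frac{\Delta(W/L)_P^2}{(W/L)_P^2} + \frac{\Delta(W/L)_N^2}{(W/L)_N^2}\right)\right]^{1/2}
\tag{A.6}
$$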

Here we consider the worst case and simply take the maximum threshold voltage
difference between process corners as the threshold voltage standard deviation,
i.e., $\Delta V_{th,P} \approx \Delta V_{th,N} \approx 0.1$ V for TSMC 0.18 µm digital CMOS
technology [59], which means that the mismatch contribution from $\Delta V_{th}$ is more
important than that from $\Delta(W/L)$. The offset standard deviation is then
$V_{OS,in} \approx [2 \times (0.1\,\mathrm{V})^2]^{1/2} \approx 0.14$ V. To reduce the
offset below 1.5 mV via an autozeroing technique, a finite opamp gain of 100 is
required.
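
The arithmetic behind this estimate can be checked with a short sketch. It evaluates equation A.6 and then applies a first-order autozeroing model in which the residual offset is the original offset divided by (1 + gain). That model, the function names, and the thermal voltage value are assumptions introduced here for illustration.

```python
import math

U_T = 0.0259  # thermal voltage near room temperature, in volts (assumed)


def offset_sigma(dvth_p, dvth_n, dwl_p=0.0, dwl_n=0.0, u_t=U_T):
    """Equation A.6; dwl_p and dwl_n are relative mismatches d(W/L)/(W/L)."""
    return math.sqrt(dvth_p ** 2 + dvth_n ** 2
                     + u_t ** 2 * (dwl_p ** 2 + dwl_n ** 2))


def residual_offset(v_os, gain):
    """Assumed first-order autozero model: offset is divided by (1 + gain)."""
    return v_os / (1.0 + gain)


sigma = offset_sigma(0.1, 0.1)            # worst-case 0.1 V threshold spread
print(sigma)                              # about 0.14 V
print(residual_offset(sigma, 100.0))      # about 1.4 mV, below the 1.5 mV target
```

Even a 10% geometric mismatch on both pairs adds only a few millivolts through the $U_T$ term, which is why the threshold term dominates in this estimate.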


¹ We neglect the dependence of $\Delta V_{th}$ on W here for simplicity.


APPENDIX  B 

CURRENT  MIRROR  MISMATCH  BEHAVIOR 
Special care needs to be taken to design current mirrors with small mismatch.
The current mirror mismatch can be obtained by calculating the total differential of
the input current. Since the subthreshold current is
$I_D = \mu C_{ox} (m - 1) U_T^2 (W/L) \exp[(V_{gs} - V_{th})/U_T]$,
we can estimate the current mirror mismatch as

$$
\Delta I_D = \frac{\partial I_D}{\partial (W/L)} \Delta(W/L) + \frac{\partial I_D}{\partial (V_{gs} - V_{th})} \Delta(V_{gs} - V_{th})
= I_D \frac{\Delta(W/L)}{W/L} - I_D \frac{\Delta V_{th}}{U_T}
$$

where mismatches in $\mu C_{ox} (m - 1)$ are ignored. It yields

$$
\frac{\Delta I_D}{I_D} = \frac{\Delta(W/L)}{W/L} - \frac{\Delta V_{th}}{U_T}
\tag{B.1}
$$


This result suggests that, to minimize current mismatch, W and L must be
maximized, and then the threshold mismatch will eventually limit the current mirror
performance. In contrast, for a mirror working in the above-threshold region, the
current mismatch is given by

$$
\frac{\Delta I_D}{I_D} = \frac{\Delta(W/L)}{W/L} - \frac{\Delta V_{th}}{(V_{gs} - V_{th})/2}
\tag{B.2}
$$


The above equation indicates that we can reduce the current mirror mismatch by
pushing the mirror pair into the above-threshold region if $(V_{gs} - V_{th})/2 > U_T$. In
our TTFS-CM imager design, we need to intentionally use long-channel MOSFETs
to ensure that the current mirrors operate in the above-threshold region and to
maximize $(V_{gs} - V_{th})/2$ as well.
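
A small numerical sketch can make the comparison between equations B.1 and B.2 concrete. The mismatch values, the overdrive voltage, and the thermal voltage below are hypothetical numbers chosen only to illustrate the trend; the function names are likewise illustrative.

```python
U_T = 0.0259  # thermal voltage near room temperature, in volts (assumed)


def mismatch_subthreshold(dwl_rel, dvth, u_t=U_T):
    """Relative current mismatch in the subthreshold region (equation B.1)."""
    return dwl_rel - dvth / u_t


def mismatch_above_threshold(dwl_rel, dvth, v_ov):
    """Relative current mismatch above threshold (equation B.2).

    v_ov is the overdrive voltage V_gs - V_th in volts.
    """
    return dwl_rel - dvth / (v_ov / 2.0)


# Hypothetical mismatch: 1% in W/L and 5 mV in threshold voltage
sub = mismatch_subthreshold(0.01, 0.005)
above = mismatch_above_threshold(0.01, 0.005, v_ov=0.4)
print(sub, above)
```

With these example numbers the threshold term dominates in both regions, but it is several times smaller above threshold because it is divided by $(V_{gs} - V_{th})/2$ rather than by $U_T$.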